(54) Title: SOLID PHASE SEQUENCING OF BIOPOLYMERS
(57) Abstract 



This invention relates to methods for detecting and sequencing target
nucleic acid sequences, and double-stranded nucleic acid sequences, to
nucleic acid probes, [illegible] to arrays of probes useful in these
methods and to kits and systems which contain these probes. Useful
methods involve hybridizing the nucleic acids or nucleic acids which
represent complementary or homologous sequences of the target to an
array of nucleic acid probes. These probes comprise a single-stranded
portion, an optional double-stranded portion and a variable sequence
within the single-stranded portion. The molecular weights of the
hybridized nucleic acids of the set can be determined by mass
spectroscopy, and the sequence of the target determined from the
molecular weights of the fragments. Nucleic acids whose sequences can
be determined include DNA or RNA in biological samples such as patient
biopsies and environmental samples. Probes may be fixed to a solid
support such as a hybridization chip to facilitate automated molecular
weight analysis and identification of the target sequence.

SOLID PHASE SEQUENCING OF BIOPOLYMERS

Rights in the Invention

This invention was made with United States Government support under
grant number DE-FG-02-93ER61609, awarded by the United States
Department of Energy, and the United States Government has certain
rights in the invention.

Background of the Invention

1. Field of the Invention

This invention relates to methods for detecting and sequencing nucleic
acids using sequencing by hybridization technology and molecular weight
analysis. The invention also relates to probes and arrays useful in
sequencing and detection and to kits and apparatus for determining
sequence information.

2. Description of the Background

Since the recognition of nucleic acid as the carrier of the genetic
code, a great deal of interest has centered around determining the
sequence of that code in the many forms in which it is found. Two
landmark studies made the process of nucleic acid sequencing, at least
with DNA, a common and relatively rapid procedure practiced in most
laboratories. The first describes a process whereby terminally labeled
DNA molecules are chemically cleaved at single base repetitions (A.M.
Maxam and W. Gilbert, Proc. Natl. Acad. Sci. USA 74:560-64, 1977). Each
base position in the nucleic acid sequence is then determined from the
molecular weights of fragments produced by partial cleavages.
Individual reactions were devised to cleave preferentially at guanine,
at adenine, at cytosine and thymine, and at cytosine alone. When the
products of these four reactions are resolved by molecular weight,
using, for example, polyacrylamide gel electrophoresis, DNA sequences
can be read from the pattern of fragments on the resolved gel.

The second study describes a procedure whereby DNA is sequenced using a
variation of the plus-minus method (F. Sanger et al., Proc. Natl. Acad.
Sci. USA 74:5463-67, 1977). This procedure takes advantage of the chain
terminating ability of dideoxynucleoside triphosphates (ddNTPs) and the
ability of DNA polymerase to incorporate ddNTPs with nearly equal
fidelity as the natural substrate of DNA polymerase, deoxynucleoside
triphosphates (dNTPs). Briefly, a primer, usually an oligonucleotide,
and a template DNA are incubated together in the presence of a useful
concentration of all four dNTPs plus a limited amount of a single
ddNTP. The DNA polymerase occasionally incorporates a dideoxynucleotide
which terminates chain extension. Because the dideoxynucleotide has no
3'-hydroxyl, the initiation point for the polymerase enzyme is lost.
Polymerization produces a mixture of fragments of varied sizes, all
having identical 3' termini. Fractionation of the mixture by, for
example, polyacrylamide gel electrophoresis, produces a pattern which
indicates the presence and position of each base in the nucleic acid.
Reactions with each of the four ddNTPs allow one of ordinary skill to
read an entire nucleic acid sequence from a resolved gel.
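
The gel-reading step described above can be illustrated with a minimal
Python sketch. It assumes each of the four termination reactions has
already been reduced to a list of fragment lengths; the function name
read_ladder and the toy lane data are hypothetical and are not taken
from the cited studies.

```python
def read_ladder(lanes):
    """Read a sequence from four chain-termination lanes.

    lanes maps each terminating base to the fragment lengths (in
    nucleotides) seen in that lane. Shorter fragments run further on
    the gel, so sorting by length reads the newly synthesized strand
    from its 5' end to its 3' end.
    """
    calls = {}
    for base, lengths in lanes.items():
        for length in lengths:
            if length in calls:
                raise ValueError(f"length {length} appears in two lanes")
            calls[length] = base
    return "".join(calls[length] for length in sorted(calls))


# Toy data, invented for illustration only.
lanes = {"A": [1, 5], "C": [2, 6], "G": [3, 7], "T": [4]}
sequence = read_ladder(lanes)
print(sequence)
```

A fragment length that shows up in two lanes is treated as an error
here; a real gel reader would have to resolve such bands rather than
reject them.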

Despite their advantages, these procedures are cumbersome and
impractical when one wishes to obtain megabases of sequence
information. Further, these procedures are, for all practical purposes,
limited to sequencing DNA. Although variations have developed, it is
still not possible using either process to obtain sequence information
directly from any other form of nucleic acid.

A relatively new method for obtaining sequence information from a
nucleic acid has recently been developed whereby the sequences of
groups of contiguous bases are determined simultaneously. In comparison
to traditional techniques whereby one determines base specific
information of a sequence individually, this method, referred to as
sequencing by hybridization (SBH), represents a many-fold amplification
in speed. Due, at least in part, to the increased speed, SBH presents
numerous advantages including reduced expense and greater accuracy. Two
general approaches of sequencing by hybridization have been suggested
and their practicality has been demonstrated in pilot studies. In one
format, a complete set of 4^n oligonucleotides of length n is
immobilized as an ordered array on a solid support and an unknown DNA
sequence is hybridized to this array (K.R. Khrapko et al., J. DNA
Sequencing and Mapping 1:375-88, 1991). The resulting hybridization
pattern provides all "n-tuple" words in the sequence. This is
sufficient to determine short sequences except for simple tandem
repeats.
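
As a rough illustration of how a set of n-tuple words determines a
short sequence, the following Python sketch collects the words of a
target and rebuilds the target by overlapping them. It is a simplified
greedy reconstruction, not the procedure of the cited work; it assumes
n is at least 2, error-free hybridization and a known starting word,
and the names spectrum and reconstruct are hypothetical.

```python
def spectrum(seq, n):
    """Return the set of all n-tuple words contained in seq."""
    return {seq[i:i + n] for i in range(len(seq) - n + 1)}


def reconstruct(words, start):
    """Extend start one base at a time while exactly one unused word
    overlaps the current end by n - 1 bases."""
    n = len(start)
    unused = set(words) - {start}
    seq = start
    while True:
        tail = seq[-(n - 1):]
        candidates = [w for w in unused if w.startswith(tail)]
        if len(candidates) != 1:
            return seq
        seq += candidates[0][-1]
        unused.remove(candidates[0])


target = "ATGCAGGTC"          # toy target, invented for illustration
words = spectrum(target, 4)
rebuilt = reconstruct(words, target[:4])

# A simple tandem repeat yields the same words whatever its length,
# so the hybridization pattern cannot tell the two targets apart.
same_words = spectrum("ACACAC", 3) == spectrum("ACACACAC", 3)
```

The last line shows the tandem-repeat limitation noted above: both
repeats produce only the words ACA and CAC.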

In the second format, an array of immobilized samples is hybridized
with one short oligonucleotide at a time (Z. Strezoska et al., Proc.
Natl. Acad. Sci. USA 88:10,089-93, 1991). When repeated 4^n times for
each oligonucleotide of length n, much of the sequence of all the
immobilized samples would be determined. In both approaches, the
intrinsic power of the method is that many sequenced regions are
determined in parallel. In actual practice the array size is about
10^4 to 10^5.

Another aspect of the method is that information obtained is quite
redundant, especially as the size of the nucleic acid probe grows.
Mathematical simulations have shown that the method is quite resistant
to experimental errors and that far fewer than all probes are necessary
to determine reliable sequence data (P.A. Pevzner et al., J. Biomol.
Struc. & Dyn. 9:399-410, 1991; W. Bains, Genomics 11:295-301, 1991).

In spite of an overall optimistic outlook, there are still a number of
potentially severe drawbacks to actual implementation of sequencing by
hybridization. First and foremost among these is that 4^n rapidly
becomes quite a large number if chemical synthesis of all of the
oligonucleotide probes is actually contemplated. Various schemes of
automating this synthesis and compressing the products into a small
scale array, a sequencing chip, have been proposed.
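
The growth of 4^n with probe length can be made concrete with a short
Python calculation. The helper name probe_count is hypothetical; the
arithmetic itself follows directly from four possible bases at each of
n positions.

```python
def probe_count(n):
    """Number of distinct oligonucleotides of length n over A, C, G, T."""
    return 4 ** n


# Each added base multiplies the number of probes by four.
table = {n: probe_count(n) for n in range(6, 11)}
for n, count in table.items():
    print(f"n = {n:2d}: {count:>9,d} probes")
```

Under this count, complete arrays of 7-mers and 8-mers hold 16,384 and
65,536 probes respectively, and a complete set of 10-mers already
exceeds one million probes.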

There is also a poor level of discrimination between correctly
hybridized, perfectly matched duplexes and end mismatches. In part,
these drawbacks have been addressed at least to a small degree by the
method of continuous stacking hybridization as reported by Khrapko et
al. (FEBS Lett. 256:118-22, 1989). Continuous stacking hybridization is
based upon the observation that when a single-stranded oligonucleotide
is hybridized adjacent to a double-stranded oligonucleotide, the two
duplexes are mutually stabilized as if they are positioned side-to-side
due to a stacking contact between them. The stability of the
interaction decreases significantly as stacking is disrupted by
nucleotide displacement, gap or terminal mismatch. Internal mismatches
are presumably ignorable because their thermodynamic stability is so
much less than perfect matches. Although promising, a related problem
arises which is the inability to distinguish between weak, but correct
duplex formation, and simple background such as non-specific adsorption
of probes to the underlying support matrix.

Detection is also monochromatic wherein separate sequential positive
and negative controls must be run to discriminate between a correct
hybridization match, a mismatch, and background. All too often,
ambiguities develop in reading sequences longer than a few hundred base
pairs on account of sequence recurrences. For example, if a sequence
one base shorter than the probe recurs three times in the target, the
sequence position cannot be uniquely determined. The locations of these
sequence ambiguities are called branch points.
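
The branch point condition in the example above can be checked
mechanically. The Python sketch below counts every word one base
shorter than the probe and reports those that recur at least three
times; the function name branch_points, the min_count parameter and
the toy target are hypothetical.

```python
from collections import Counter


def branch_points(seq, n, min_count=3):
    """Return the (n - 1)-base words that recur at least min_count
    times in seq, where n is the probe length."""
    k = n - 1
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {word: c for word, c in counts.items() if c >= min_count}


# Toy target, invented for illustration: ATGC occurs three times.
target = "ATGCCTGATGCAATGCT"
found = branch_points(target, 5)
print(found)
```

With 5-base probes, the segments lying between the three copies of
ATGC in this target could be joined in more than one order, which is
the ambiguity the passage describes.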

Secondary structures often develop in the target nucleic acid affecting
accessibility of the sequences. This could lead to blocks of sequences
that are unreadable if the secondary structure is more stable than
occurs on the complementary strand.

A final drawback is the possibility that certain probes will have
anomalous behavior and, for one reason or another, be recalcitrant to
hybridization under whatever standard sets of conditions are ultimately
used. A simple example of this is the difficulty in finding matching
conditions for probes rich in G/C content. A more complex example could
be sequences with a high propensity to form triple helices. The only
way to rigorously explore these possibilities is to carry out extensive
hybridization studies with all possible oligonucleotides of length "n"
under the particular format and conditions chosen. This is clearly
impractical if many sets of conditions are involved.
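
The difficulty with G/C-rich probes can be illustrated with the Wallace
rule, a common rule of thumb that estimates the melting temperature of
a short oligonucleotide duplex as 2 degrees C per A or T plus 4 degrees
C per G or C. The rule is not part of the source text and is only a
coarse approximation; the function name wallace_tm and the two example
probes are hypothetical.

```python
def wallace_tm(probe):
    """Approximate duplex melting temperature in degrees C using the
    Wallace rule: 2 per A/T base plus 4 per G/C base."""
    at = sum(probe.count(base) for base in "AT")
    gc = sum(probe.count(base) for base in "GC")
    return 2 * at + 4 * gc


at_rich = wallace_tm("ATATATAT")
gc_rich = wallace_tm("GCGCGCGC")
print(at_rich, gc_rich)
```

Even under this rough estimate, two probes of equal length can differ
by a factor of two in estimated melting temperature, so a single
hybridization temperature is unlikely to suit every probe in an array.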

Among the early publications which appeared discussing sequencing by
hybridization, E.M. Southern (WO 89/10977) described methods whereby
unknown, or target, nucleic acids are labeled, hybridized to a set of
nucleotides of chosen length on a solid support, and the nucleotide
sequence of the target determined, at least partially, from knowledge
of the sequence of the bound fragments and the pattern of hybridization
observed. Although promising, as a practical matter, this method has
numerous drawbacks. Probes are entirely single-stranded and binding
stability is dependent upon the size of the duplex. However, every
additional nucleotide of the probe necessarily increases the size of
the array by four fold, creating a dichotomy which severely restricts
its plausible use. Further, there is an inability to deal with branch
point ambiguities or secondary structure of the target, and
hybridization conditions will have to be tailored, or in some way
accounted for, for each binding event. Attempts have been made to
overcome or circumvent these problems.

R. Drmanac et al. (U.S. Patent No. 5,202,231) is directed to methods
for sequencing by hybridization using sets of oligonucleotide probes
with random or variable sequences. These probes, although useful,
suffer from some of the same drawbacks as the methodology of Southern
(1989), and like Southern, fail to recognize the advantages of stacking
interactions.

K.R. Khrapko et al. (FEBS Lett. 256:118-22, 1989; and J. DNA Sequencing
and Mapping 1:375-88, 1991) attempt to address some of these problems
using a technique referred to as continuous stacking hybridization.
With continuous stacking, conceptually, the entire sequence of a target
nucleic acid can be determined. Basically, the target is hybridized to
an array of probes, again single-stranded, denatured from the array,
and the dissociation kinetics of denaturation analyzed to determine the
target sequence. Although also promising, discrimination between
matches and mismatches (and simple background) is low and, further, as
hybridization conditions are inconstant for each duplex, discrimination
becomes increasingly reduced with increasing target complexity.

Another major problem with current sequencing formats is the inability
to efficiently detect sequence information. In conventional procedures,
individual sequences are separated by, for example, electrophoresis
using capillary or slab gels. This step is slow, expensive and requires
the talents of a number of highly trained individuals, and, more
importantly, is prone to error. One attempt to overcome these
difficulties has been to utilize the technology of mass spectrometry.

Mass spectrometry of organic molecules was made possible by the
development of instruments able to volatize large varieties of organic
compounds and by the discovery that the molecular ion formed by
volatization breaks down into charged fragments whose structures can be
related to the intact molecule. Although the process itself is
relatively straightforward, actual implementation is quite complex.
Briefly, the sample molecule or analyte is volatized and the resulting
vapor passed into an ion chamber where it is bombarded with electrons
accelerated to a compatible energy level. Electron bombardment ionizes
the molecules of the sample analyte and then directs the ions formed to
a mass analyzer. The mass analyzer, with its combination of electrical
and magnetic fields, separates impacting ions according to their
mass/charge (m/e) ratios. From these ratios, the molecular weights of
the impacting ions can be determined and the structure and molecular
weight of the analyte determined. The entire process requires less than
about 20 microseconds.

Attempts to apply mass spectrometry to the analysis of biomolecules
such as proteins and nucleic acids have been disappointing. Mass
spectrometric analysis has traditionally been limited to molecules with
molecular weights of a few thousand daltons. At higher molecular
weights, samples become increasingly difficult to volatize and large
polar molecules generally cannot be vaporized without catastrophic
consequences. The energy requirement is so significant that the
molecule is destroyed or, even worse, fragmented. Mass spectra of
fragmented molecules are often difficult or impossible to read.
Fragment linking order, particularly useful for reconstructing a
molecular structure, has been lost in the fragmentation process. Both
signal to noise ratio and resolution are significantly negatively
affected. In addition, and specifically with regard to biomolecular
sequencing, extreme sensitivity is necessary to detect the single base
differences between biomolecular polymers to determine sequence
identity.
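
To give a sense of scale for the single base differences mentioned
above, the following Python sketch compares the average masses of the
four DNA nucleotide residues and estimates the resolving power needed
to separate the closest pair in an oligonucleotide of a given length.
The residue masses are standard average values for unmodified DNA
(nucleoside monophosphate minus water) and are not taken from the
source; the function name and the 100-base example are hypothetical,
and end groups and counter-ions are ignored.

```python
from itertools import combinations

# Average residue masses in daltons for unmodified DNA.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}

# Smallest mass difference between any two bases (A versus T).
smallest_gap = min(
    abs(a - b) for a, b in combinations(RESIDUE_MASS.values(), 2)
)


def required_resolving_power(length):
    """Rough m / delta-m needed to tell apart two strands of the given
    length that differ by the closest pair of bases."""
    mean_residue = sum(RESIDUE_MASS.values()) / len(RESIDUE_MASS)
    return length * mean_residue / smallest_gap


print(round(smallest_gap, 2), round(required_resolving_power(100)))
```

On these assumptions an A-for-T substitution changes the mass by only
about 9 daltons, so resolving it in a strand of roughly 100 bases calls
for a resolving power in the low thousands.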

A number of new methods have been developed based on the idea that
heat, if applied with sufficient rapidity, will vaporize the sample
biomolecule before decomposition has an opportunity to take place. This
rapid heating technique is referred to as plasma desorption and there
are many variations. For example, one method of plasma desorption
involves placing a radioactive isotope such as Californium-252 on the
surface of a sample analyte which forms a blob of plasma. From this
plasma, a few ions of the sample molecule will emerge intact. Field
desorption ionization, another form of desorption, utilizes strong
electrostatic fields to literally extract ions from a substrate. In
secondary ionization mass spectrometry or fast ion bombardment, an
analyte surface is bombarded with electrons which encourage the release
of intact ions. Fast atom bombardment involves bombarding a surface
with accelerated ions which are neutralized by a charge exchange before
they hit the surface. Presumably, neutralization of the charge lessens
the probability of molecular destruction, but not the creation of ionic
forms of the sample. In laser desorption, photons comprise the vehicle
for depositing energy on the surface to volatize and ionize molecules
of the sample. Each of these techniques has had some measure of success
with different types of sample molecules. Recently, there have also
been a variety of techniques and combinations of techniques
specifically directed to the analysis of nucleic acids.

Brennan et al. used nuclide markers to identify terminal nucleotides in
a DNA sequence by mass spectrometry (U.S. Patent No. 5,003,059). Stable
nuclides, detectable by mass spectrometry, were placed in each of the
four dideoxynucleotides used as reagents to polymerize cDNA copies of
the target DNA sequence. Polymerized copies were separated
electrophoretically by size and the terminal nucleotide identified by
the presence of the unique label.

Fenn et al. describe a process for the production of a mass spectrum
containing a multiplicity of peaks (U.S. Patent No. 5,130,538). Peak
components comprised multiply charged ions formed by dispersing a
solution containing an analyte into a bath gas of highly charged
droplets. An electrostatic field charged the surface of the solution
and dispersed the liquid into a spray referred to as an electrospray
(ES) of charged droplets. This nebulization provided a high charge/mass
ratio for the droplets, increasing the upper limit of volatization.
Detection was still limited to less than about 100,000 daltons.

Jacobson et al. utilize mass spectrometry to analyze a DNA sequence by
incorporating stable isotopes into the sequence (U.S. Patent No.
5,002,868). Incorporation required the steps of enzymatically
introducing the isotope into a strand of DNA at a terminus,
electrophoretically separating the strands to determine fragment size
and analyzing the separated strand by mass spectrometry. Although
accuracy was stated to have been increased, electrophoresis was
necessary to isolate the labeled strand.

Brennan also utilized stable markers to label the terminal nucleotides
in a nucleic acid sequence, but added the step of completely degrading
the components of the sample prior to analysis (U.S. Patent Nos.
5,003,059 and 5,174,962). Nuclide markers, enzymatically incorporated
into either dideoxynucleotides or nucleic acid primers, were
electrophoretically separated. Bands were collected and subjected to
combustion and passed through a mass spectrometer. Combustion converts
the DNA into oxides of carbon, hydrogen, nitrogen and phosphorus, and
the label into sulfur dioxide. Labeled combustion products were
identified and the mass of the initial molecule reconstructed. Although
fairly accurate, the process does not lend itself to large scale
sequencing of biopolymers.

A recent advancement in the mass spectrometric analysis of high
molecular weight molecules in biology has been the development of time
of flight mass spectrometry (TOF-MS) with matrix-assisted laser
desorption ionization (MALDI). This process involves placing the sample
into a matrix which contains molecules which assist in the desorption
process by absorbing energy at the frequency used to desorb the sample.
The theory is that volatization of the matrix molecules encourages
volatization of the sample without significant destruction. Time of
flight analysis utilizes the travel time or flight time of the various
ionic species as an accurate indicator of molecular mass. There have
been some notable successes with these techniques.
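
The link between flight time and molecular mass can be sketched with
the idealized relation for a linear instrument, in which an ion of
charge z accelerated through a potential U acquires kinetic energy
z e U and then drifts a distance L, so that t = L * sqrt(m / (2 z e U)).
The Python sketch below uses that textbook relation; the 20 kV
accelerating voltage and 1 m flight tube are assumed values chosen for
illustration, not parameters from the source.

```python
import math

E_CHARGE = 1.602176634e-19    # elementary charge, coulombs
DALTON = 1.66053906660e-27    # atomic mass constant, kilograms


def flight_time(mass_da, charge=1, voltage=20_000.0, length=1.0):
    """Idealized drift time in seconds for an ion of the given mass
    (daltons) after acceleration through voltage (volts) over a
    field-free path of the given length (meters)."""
    mass_kg = mass_da * DALTON
    velocity = math.sqrt(2 * charge * E_CHARGE * voltage / mass_kg)
    return length / velocity


t_small = flight_time(5_000.0)
t_large = flight_time(20_000.0)
print(f"{t_small * 1e6:.1f} us, {t_large * 1e6:.1f} us")
```

Because flight time grows with the square root of mass, quadrupling
the mass only doubles the flight time, which is one reason timing
precision matters more as fragments get larger.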

Beavis et al. proposed to measure the molecular weights of DNA
fragments in mixtures prepared by either Maxam-Gilbert or Sanger
sequencing techniques (U.S. Patent No. 5,288,644). Each of the
different DNA fragments to be generated would have a common origin and
terminate at a particular base along an unknown sequence. The separate
mixtures would be analyzed by laser desorption time of flight mass
spectroscopy to determine fragment molecular weights. Spectra obtained
from each reaction would be compared using computer algorithms to
determine the location of each of the four bases and ultimately, the
sequence of the fragment.
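
One way such a computer comparison could work is sketched below in
Python. It is a simplification, not the algorithm of the cited patent:
rather than comparing four separate spectra, it takes a single ladder
of fragment masses sharing a common origin and assigns a base to each
step from the mass difference between neighbors. The residue masses
are standard average values for unmodified DNA; the function name, the
tolerance and the 5000 dalton starting mass are hypothetical.

```python
# Average residue masses in daltons for unmodified DNA.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def call_bases(ladder_masses, tolerance=2.0):
    """Assign a base to each step of a fragment ladder from the mass
    difference between successive fragments."""
    ordered = sorted(ladder_masses)
    calls = []
    for lighter, heavier in zip(ordered, ordered[1:]):
        step = heavier - lighter
        base, mass = min(
            RESIDUE_MASS.items(), key=lambda item: abs(item[1] - step)
        )
        if abs(mass - step) > tolerance:
            raise ValueError(f"no base matches a step of {step:.2f} Da")
        calls.append(base)
    return "".join(calls)


# Build a toy ladder for the extension GATC from an assumed start mass.
ladder = [5000.0]
for base in "GATC":
    ladder.append(ladder[-1] + RESIDUE_MASS[base])
called = call_bases(ladder)
print(called)
```

The tolerance must stay well below the smallest difference between two
residue masses (about 9 daltons for A versus T), otherwise neighboring
bases could be confused.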

Williams et al. utilized a combination of pulsed laser ablation,
multiphoton ionization and time of flight mass spectrometry. Effective
laser desorption was accomplished by ablating a frozen film of a
solution containing sample molecules. When ablated, the film produces
an expanding vapor plume which entrains the intact molecules for
analysis by mass spectrometry.

Even more recent developments in mass spectrometry have further
increased the upper limits of molecular weight detection and
determination. Mass spectrograph systems with reflectors in the flight
tube have effectively doubled resolution. Reflectors also compensate
for errors in mass caused by the fact that the ionized/accelerated
region of the instrument is not a point source, but an area of finite
size wherein ions can accelerate at any point. Spatial differences
between the origination points of the particles, problematic in
conventional instruments because arrival times at the detector will
vary, are overcome. Particles that spend more time in the accelerating
field will also spend more time in the retarding field. Therefore,
particles emerging from the reflector are mostly synchronous, vastly
improving resolution.

Despite these advances, it is still not possible to generate
coordinated spectra representing a continuous sequence. Furthermore,
throughput is sufficiently slow so as to make these methods impractical
for large scale analysis of sequence information.

Summary of the Invention

The present invention overcomes the problems and disadvantages
associated with current strategies and designs and provides methods,
kits and apparatus for determining the sequence of target nucleic
acids.

One embodiment of the invention is directed to methods for sequencing a
target nucleic acid. A set of nucleic acid fragments containing a
sequence which is complementary or homologous to a sequence of the
target is hybridized to an array of nucleic acid probes wherein each
probe comprises a double-stranded portion, a single-stranded portion
and a variable sequence within said single-stranded portion, forming a
target array of nucleic acids. Molecular weights for a plurality of
nucleic acids of the target array are determined and the sequence of
the target constructed. Nucleic acids of the target, the target
sequence, the set and the probes may be DNA, RNA or PNA comprising
purine, pyrimidine or modified bases. The probes may be fixed to a
solid support such as a hybridization chip to facilitate automated
determination of molecular weights and identification of the target
sequence.

Another embodiment of the invention is directed to methods for
sequencing a target nucleic acid. A set of nucleic acid fragments
containing a sequence which is complementary or homologous to a
sequence of the target is hybridized to an array of nucleic acid probes
forming a target array containing a plurality of nucleic acid
complexes. One strand of those probes hybridized by a fragment is
extended using the fragment as a template. Molecular weights for a
plurality of nucleic acids of the target array are determined and the
sequence of the target constructed. Strands can be enzymatically
extended using chain terminating and chain elongating nucleotides. The
resulting nested set of nucleic acids represents the sequence of the
target.

Another embodiment of the invention is directed to methods for
detecting a target nucleic acid. A set of nucleic acids complementary
to a sequence of the target is hybridized to a fixed array of nucleic
acid probes. The molecular weights of the hybridized nucleic acids are
determined by mass spectrometry and a sequence of the target can be
identified. Target nucleic acids may be obtained from biological
samples such as patient samples wherein detection of the target is
indicative of a disorder in the patient, such as a genetic defect, a
neoplasm or an infection.

Another embodiment of the invention is directed to methods for
sequencing a target nucleic acid. A sequence of the target is cleaved
into nucleic acid fragments and the fragments hybridized to an array of
nucleic acid probes. Fragments are created by enzymatically or
physically cleaving the target and the sequence of the fragments is
homologous with or complementary to at least a portion of the target
sequence. The array is attached to a solid support and the molecular
weights of the hybridized fragments determined by mass spectrometry.
From the molecular weights determined, nucleotide sequences of the
hybridized fragments are determined and a nucleotide sequence of the
target can be identified.

Another embodiment of the invention is directed to methods for
sequencing a target nucleic acid. A set of nucleic acids complementary
to a sequence of the target is hybridized to an array of
single-stranded nucleic acid probes wherein each probe comprises a
constant sequence and a variable sequence and said variable sequence is
determinable. The molecular weights of the hybridized nucleic acids are
determined and the sequence of said target identified. The array
comprises less than or equal to about 4^R different probes, wherein R
is the length in nucleotides of the variable sequence, and may be
attached to a solid support.
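
The upper bound of 4^R probes follows from enumerating every possible
variable sequence of length R. A minimal Python sketch of such an
enumeration is given below; the function name probe_array, the
placeholder constant sequence and the choice to place the variable
sequence after the constant one are all assumptions made for
illustration.

```python
from itertools import product


def probe_array(constant, r):
    """Return every probe formed by a shared constant sequence
    followed by one of the 4**r possible variable sequences."""
    return [constant + "".join(v) for v in product("ACGT", repeat=r)]


# Placeholder constant region, invented for illustration only.
probes = probe_array("GATTACA", 3)
print(len(probes), probes[0], probes[-1])
```

An array may of course carry fewer than all 4^R probes, which is why
the text speaks of "less than or equal to about" that number.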

Another embodiment of the invention is directed to methods for
sequencing a target nucleic acid by strand-displacement,
double-stranded sequencing. A set of partially single-stranded and
partially double-stranded nucleic acid fragments is provided wherein
each fragment contains a sequence that corresponds to a sequence of the
target. These nucleic acid fragments are hybridized to a set of
partially single-stranded and partially double-stranded nucleic acid
probes, via the single-stranded regions of each, to form a set of
fragment/probe complexes. Prior to hybridization, either the fragments
or the probes may be treated with a phosphorylase to remove phosphate
groups from the 5'-termini of the nucleic acids. 5'-termini are ligated
with adjacent 3'-termini of the complex forming a common single strand.
The complementary unligated strand contains a nick which is recognized
by a nucleic acid polymerase that initiates strand-displacement
polymerization, extending the unligated strand. Polymerization
proceeds, using the ligated strand as a template, in the presence of
labeled nucleotides such as mass modified nucleotides. The sequence of
the target can be determined by mass spectrometry from the molecular
weights of the extended strands. This process can be used to sequence
target nucleic acids and also to identify a single sequence in a mixed
background. Selection of the species of nucleic acid to be sequenced
occurs upon hybridization to the probe. As only fragments complementary
to the single-stranded region of the probe will form complexes, only
those fragment complexes are sequenced.
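
The selection step, in which only fragments complementary to the
single-stranded region of a probe form complexes, can be sketched as a
simple string match. The Python sketch below assumes perfect-match
hybridization, represents each fragment by one strand written 5' to 3',
and assumes capture occurs at the fragment's 3' end; these choices and
the names reverse_complement and captured are hypothetical
simplifications.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def captured(fragments, probe_overhang):
    """Return the fragments whose 3' end is perfectly complementary
    to the probe's single-stranded overhang."""
    site = reverse_complement(probe_overhang)
    return [f for f in fragments if f.endswith(site)]


# Toy fragments and overhang, invented for illustration only.
fragments = ["GGCTAAACGT", "GGCTAAACGA", "TTTTAACGT"]
selected = captured(fragments, "ACGTT")
print(selected)
```

Fragments that fail the match are simply absent from the result, which
mirrors the statement that only complexed fragments are sequenced.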

Another embodiment of the invention is directed to arrays of nucleic
acid probes. In these arrays, each probe comprises a first strand and a
second strand wherein the first strand is hybridized to the second
strand forming a double-stranded portion, a single-stranded portion and
a variable sequence within the single-stranded portion. The array may
be attached to a solid support such as a material that facilitates
volatization of nucleic acids for mass spectrometry. Arrays can be
fixed to hybridization chips containing less than or equal to about 4^R
different probes wherein R is the length in nucleotides of the variable
sequence. Arrays can be used in detection methods and in kits to detect
nucleic acid sequences which may be indicative of a disorder and in
sequencing systems such as sequencing by mass spectrometry.

Another embodiment of the invention is directed to arrays of
single-stranded nucleic acid probes wherein each probe of the array
comprises a constant sequence and a variable sequence which is
determinable. Arrays may be attached to solid supports which comprise
matrices that facilitate volatization of nucleic acids for mass
spectrometry. Arrays, generated by conventional processes, may be
characterized using the above methods and replicated in mass for use in
nucleic acid detection and sequencing systems.

Another embodiment of the invention is directed to kits for detecting a
sequence of a target nucleic acid. Kits contain arrays of nucleic acid
probes fixed to a solid support wherein each probe comprises a
double-stranded portion, a single-stranded portion and a variable
sequence within said single-stranded portion. The solid support may be,
for example, coated with a matrix that facilitates volatization of
nucleic acids for mass spectrometry such as an aqueous composition.

Another embodiment of the invention is directed to mass spectrometry
systems for the rapid sequencing of nucleic acids. Systems comprise a
mass spectrometer, a computer with appropriate software and probe
arrays which can be used to capture and sort nucleic acid sequences for
subsequent analysis by mass spectrometry.

Other embodiments and advantages of the invention are set 
forth, in part, in the description which follows and, in part, will be obvious 
from this description and may be learned from the practice of the invention. 



Description of the Drawings

Figure 1   (A) Schematic of a mass modified nucleic acid primer; and
           (B) primer mass modification moieties.
Figure 2   (A) Schematic of mass modified nucleoside triphosphate
           elongators and terminators; and (B) nucleoside triphosphate
           mass modification moieties.
Figure 3   List of mass modification moieties.
Figure 4   List of mass modification moieties.
Figure 5   Cleavage site of Mwo I indicating bidirectional sequencing.
Figure 6   Schematic of sequencing strategy after target DNA digestion
           by TspRI.
Figure 7   Calculated of matched and mismatched complementary DNA.
Figure 8   Replication of a master array.
Figure 9   Reaction scheme for the covalent attachment of DNA to a
           surface.
Figure 10  Target nucleic acid capture and ligation.
Figure 11  Ligation efficiency of matches as compared to mismatches.
Figure 12  (A) Ligation of target DNA with probe attached at
           5'-terminus; and (B) ligation of target DNA with probe
           attached at the 3'-terminus.
Figure 13  Gel reader sequencing results from primer hybridization
           analysis.
Figure 14  Mass spectrometry of oligonucleotide ladder.
Figure 15  Schematic of mass modification by alkylation.
Figure 16  Mass spectrum of 17-mer target with 0, 1 or 2 mass modified
           moieties.
Figure 17  Schematic of nicked strand displacement sequencing with
           immobilized template.
Figure 18  Analysis of sequencing reaction in the presence and absence
           of single-stranded DNA binding protein.
Figure 19  Schematic of nicked strand displacement sequencing with
           immobilized probe.
Figure 20  Results of sequencing performed using DF27-1 as a probe.
Figure 21  Results of sequencing performed using DF27-2 as a probe.
Figure 22  Results of sequencing performed using DF27-4 as a probe.
Figure 23  Results of sequencing performed using DF27-5-CY5 as a probe.
Figure 24  Results of sequencing performed using DF27-6-CY5 as a probe.


Description of the Invention

As embodied and broadly described herein, the present invention is
directed to methods for sequencing a nucleic acid, probe arrays
useful for sequencing by mass spectrometry and kits and systems which
comprise these arrays.

Nucleic acid sequencing, on both a large and small scale, is
critical to many aspects of medicine and biology such as, for example, in
the identification, analysis or diagnosis of diseases and disorders, and in
determining relationships between living organisms. Conventional
sequencing techniques rely on a base-by-base identification of the sequence
using electrophoresis in a semi-solid such as an agarose or polyacrylamide
gel to determine sequence identity. Although attempts have been made to
apply mass spectrometric analysis to these methods, the two processes are
not well suited to each other because, at least in part, information is
still being gathered in a single base format. Sequencing-by-hybridization
methodology has enhanced the sequencing process and provided a more
optimistic outlook for more rapid sequencing techniques; however, this
methodology is no more applicable to mass spectrometry than traditional
sequencing techniques.

In contrast, positional sequencing by hybridization (PSBH),
with its ability to stably bind and discriminate different sequences with
large or small arrays of probes, is well suited to mass spectrometric
analysis. Sequence information is rapidly determined in batches and with a
minimum of effort. Such processes can be used for both sequencing unknown
nucleic acids and for detecting known sequences whose presence may be an
indicator of a disease or contamination. Additionally, these processes can
be utilized to create coordinated patterns of probe arrays with known
sequences. Determination of the sequence of fragments hybridized to the
probes also reveals the sequence of the probe. These processes are currently
not possible with conventional techniques and, further, a coordinated
batch-type analysis provides a significant increase in sequencing speed and
accuracy which is expected to be required for effective large scale
sequencing operations.




PSBH is also well suited to nucleic acid analysis wherein
sequence information is not obtained directly from hybridization. Sequence
information can be learned by coupling PSBH with techniques such as mass
spectrometry. Target nucleic acid sequences can be hybridized to probes or
arrays of probes as a method of sorting nucleic acids having distinct
sequences without having a priori knowledge of the sequences of the
various hybridization events. As each probe will be represented as multiple
copies, it is only necessary that hybridization has occurred to isolate
distinct sequence packages. In addition, as distinct packages of sequences,
they can be amplified, modified or otherwise controlled for subsequent
analysis. Amplification increases the number of specific sequences, which
assists in any analysis requiring increased quantities of nucleic acid
while retaining sequence specificity. Modification may involve chemically
altering the nucleic acid molecule to assist with later or downstream
analysis.

Consequently, another important feature of the invention is the
ability to simply and rapidly mass modify the sequences of interest. A mass
modification is an alteration in the mass, typically measured in terms of
molecular weight as daltons, of a molecule. Mass modifications which
increase the discrimination between at least two nucleic acids with single
base differences in size or sequence can be used to facilitate sequencing
using, for example, molecular weight determinations.
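By way of illustration only, the effect of a mass modification on the
discrimination between two nucleic acids differing by a single base can be
sketched numerically. The residue masses and the terminal correction below
are commonly quoted approximate average values and are assumptions of this
sketch, not values taken from this description; the 50 dalton modification
in the usage note is purely hypothetical.

```python
# Approximate average masses (daltons) of nucleotide residues within a DNA
# chain. Assumed textbook values, not taken from this description.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}

# Assumed correction for an oligonucleotide with free 5'- and 3'-hydroxyls.
TERMINAL_CORRECTION = -61.96


def oligo_mass(sequence, modification_mass=0.0):
    """Return the approximate average molecular weight of a DNA strand.

    modification_mass is the total mass (daltons) added by any hypothetical
    mass modifying moieties attached to the strand.
    """
    residues = sum(RESIDUE_MASS[base] for base in sequence.upper())
    return residues + TERMINAL_CORRECTION + modification_mass


def mass_difference(seq_a, seq_b, mod_a=0.0, mod_b=0.0):
    """Return the absolute mass difference between two strands."""
    return abs(oligo_mass(seq_a, mod_a) - oligo_mass(seq_b, mod_b))
```

For example, `mass_difference("ACGTA", "ACGTT")` is about 9 daltons,
whereas attaching a hypothetical 50 dalton moiety to only one of the two
strands, `mass_difference("ACGTA", "ACGTT", mod_a=50.0)`, widens the gap
to about 59 daltons.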

One embodiment of the invention is directed to a method for
sequencing a target nucleic acid using mass modified nucleic acids and mass
spectrometry technology. Target nucleic acids which can be sequenced
include sequences of deoxyribonucleic acid (DNA) or ribonucleic acid
(RNA). Such sequences may be obtained from biological, recombinant or
other man-made sources, or purified from a natural source such as a patient's
tissue or obtained from environmental sources. Alternate types of molecules
which can be sequenced include polyamide nucleic acid (PNA) (P.E.
Nielsen et al., Sci. 254:1497-1500, 1991) or any sequence of bases joined
by a chemical backbone that has the ability to base pair or hybridize with
a complementary chemical structure.

The bases of DNA, RNA and PNA include purines,
pyrimidines and purine and pyrimidine derivatives and modifications, which
are linearly linked to a chemical backbone. Common chemical backbone
structures are deoxyribose phosphate, ribose phosphate, and polyamide. The
purines of both DNA and RNA are adenine (A) and guanine (G). Others
that are known to exist include xanthine, hypoxanthine, 2- and 1-
diaminopurine, and other more modified bases. The pyrimidines are
cytosine (C), which is common to both DNA and RNA, uracil (U) found
predominantly in RNA, and thymine (T) which occurs almost exclusively
in DNA. Some of the more atypical pyrimidines include methylcytosine,
hydroxymethylcytosine, methyluracil, hydroxymethyluracil,
dihydroxypentyluracil, and other base modifications. These bases interact
in a complementary fashion to form base-pairs, such as, for example,
guanine with cytosine and adenine with thymine. This invention also
encompasses situations in which there is non-traditional base pairing such
as Hoogsteen base pairing which has been identified in certain tRNA
molecules and postulated to exist in a triple helix.

Sequencing involves providing a nucleic acid sequence which
is homologous or complementary to a sequence of the target. Sequences
may be chemically synthesized using, for example, phosphoramidite
chemistry or created enzymatically by incubating the target in an appropriate
buffer with chain elongating nucleotides and a nucleic acid polymerase.
Initiation and termination sites can be controlled with dideoxynucleotides
or oligonucleotide primers, or by placing coded signals directly into the
nucleic acids. The sequence created may comprise any portion of the target
sequence or the entire sequence. Alternatively, sequencing may involve
elongating DNA in the presence of boron derivatives of nucleotide
triphosphates. Resulting double-stranded samples are treated with a 3'
exonuclease such as exonuclease III. This exonuclease stops when it
encounters a boronated residue thereby creating a sequencing ladder.

Nucleic acids can also be purified, if necessary, to remove
substances which could be harmful (e.g. toxins), dangerous (e.g. infectious)
or might interfere with the hybridization reaction or the sensitivity of that
reaction (e.g. metals, salts, protein, lipids). Purification may involve
techniques such as chemical extraction with salts, chloroform or phenol,
sedimentation centrifugation, chromatography or other techniques known
to those of ordinary skill in the art.

If sufficient quantities of target nucleic acid are available and
the nucleic acids are sufficiently pure or can be purified so that any
substances which would interfere with hybridization are removed, a plurality
of target nucleic acids may be directly hybridized to the array. Sequence
information can be obtained without creating complementary or homologous
copies of a target sequence.

Sequences may also be amplified, if necessary or desired, to
increase the number of copies of the target sequence using, for example,
polymerase chain reaction (PCR) technology or any of the amplification
procedures. Amplification involves denaturation of template DNA by
heating in the presence of a large molar excess of each of two or more
oligonucleotide primers and four dNTPs (dGTP, dCTP, dATP, dTTP). The
reaction mixture is cooled to a temperature that allows the oligonucleotide
primers to anneal to target sequences, after which the annealed primers are
extended with DNA polymerase. The cycle of denaturation, annealing, and
DNA synthesis, the principle of PCR amplification, is repeated many times
to generate large quantities of product which can be easily identified.

The major product of this exponential reaction is a segment of
double stranded DNA whose termini are defined by the 5' termini of the
oligonucleotide primers and whose length is defined by the distance between
the primers. Under normal reaction conditions, the amount of polymerase
becomes limiting after 25 to 30 cycles or about one million fold
amplification. Further amplification is achieved by diluting the sample
1000 fold and using it as the template for further rounds of amplification in
another PCR. By this method, amplification levels of 10^9 to 10^10 can be
achieved during the course of 60 sequential cycles. This allows for the
detection of a single copy of the target sequence in the presence of
contaminating DNA, for example, by hybridization with a radioactive probe.
With the use of sequential PCR, the practical detection limit of PCR can be
as low as 10 copies of DNA per sample.
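The arithmetic behind these amplification levels can be sketched as
follows. This is a simplified model which assumes a constant per-cycle
efficiency (1.0 meaning perfect doubling); the function names are
illustrative only, and real reactions plateau as reagents become limiting.

```python
import math


def fold_amplification(cycles, efficiency=1.0):
    """Fold increase after a number of cycles at a constant efficiency."""
    return (1.0 + efficiency) ** cycles


def cycles_for_fold(fold, efficiency=1.0):
    """Smallest whole number of cycles that reaches the requested fold."""
    return math.ceil(math.log(fold) / math.log(1.0 + efficiency))


def implied_efficiency(fold, cycles):
    """Constant per-cycle efficiency that yields `fold` after `cycles`."""
    return fold ** (1.0 / cycles) - 1.0


def two_round_net(first_fold, dilution, second_fold):
    """Net amplification when the product of a first reaction is diluted
    and used as the template for a second reaction."""
    return first_fold / dilution * second_fold
```

Under perfect doubling, one million fold would be reached in 20 cycles;
reaching it only after 30 cycles corresponds to an average efficiency of
roughly 0.58 per cycle. Two such reactions separated by a 1000 fold
dilution give `two_round_net(1e6, 1000, 1e6)`, or about 10^9 fold.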

Although PCR is a reliable method for amplification of target
sequences, a number of other techniques can be used such as ligase chain
reaction, self sustained sequence replication, Qβ replicase amplification,
polymerase chain reaction linked ligase chain reaction, gapped ligase chain
reaction, ligase chain detection and strand displacement amplification. The
principle of ligase chain reaction is based in part on the ligation of two
adjacent synthetic oligonucleotide primers which uniquely hybridize to one
strand of the target DNA or RNA. If the target is present, the two
oligonucleotides can be covalently linked by ligase. A second pair of
primers, almost entirely complementary to the first pair of primers, is also
provided. The template and the four primers are placed into a thermocycler
with a thermostable ligase. As the temperature is raised and lowered,
oligonucleotides are renatured immediately adjacent to each other on the
template and ligated. The ligated product of one reaction serves as the
template for a subsequent round of ligation. The presence of target is
manifested as a DNA fragment with a length equal to the sum of the two
adjacent oligonucleotides.

Target sequences are fragmented, if necessary, into a plurality
of fragments using physical, chemical or enzymatic means to create a set of
fragments of uniform or relatively uniform length. Preferably, the
sequences are enzymatically cleaved using nucleases such as DNases or
RNases (mung bean nuclease, micrococcal nuclease, DNase I, RNase A,
RNase T1), type I or II restriction endonucleases, or other site-specific or
non-specific endonucleases. Sizes of nucleic acid fragments are between
about 5 to about 1,000 nucleotides in length, preferably between about 10
to about 200 nucleotides in length, and more preferably between about 12
to about 100 nucleotides in length. Sizes in the range of about 5, 10, 12, 15,
18, 20, 24, 26, 30 and 35 are useful to perform small scale analysis of short
regions of a nucleic acid target. Fragment sizes in the range of 25, 50, 75,
125, 150, 175, 200 and 250 nucleotides and larger are useful for rapidly
analyzing larger target sequences.

Target sequences may also be enzymatically synthesized
using, for example, a nucleic acid polymerase and a collection of chain
elongating nucleotides (NTPs, dNTPs) and limiting amounts of chain
terminating (ddNTPs) nucleotides. This type of polymerization reaction can
be controlled by varying the concentration of chain terminating nucleotides
to create sets, for example nested sets, which span various size ranges. In
a nested set, fragments will have one terminus in common and one terminus
which will be different between the members of the set such that the larger
fragments will contain the sequences of the smaller fragments.
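The structure of a nested set can be sketched in a few lines. The sketch
below is an idealization which assumes that termination occurs at every
position carrying the chosen terminating base; the function name and the
example sequence are hypothetical.

```python
from typing import List


def nested_set(product: str, terminator: str) -> List[str]:
    """Return the fragments formed when synthesis of `product` can stop
    at each occurrence of the chain terminating base `terminator`.

    Every fragment shares the same starting terminus; the other terminus
    differs between members of the set.
    """
    return [
        product[: position + 1]
        for position, base in enumerate(product)
        if base == terminator
    ]


def is_nested(fragments: List[str]) -> bool:
    """Check that each larger fragment contains the smaller one as a prefix."""
    ordered = sorted(fragments, key=len)
    return all(
        longer.startswith(shorter)
        for shorter, longer in zip(ordered, ordered[1:])
    )
```

For instance, `nested_set("ACGTACGT", "T")` yields the fragments `ACGT`
and `ACGTACGT`, the larger of which contains the sequence of the smaller.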

The set of fragments created, which may be either homologous
or complementary to the target sequence, is hybridized to an array of nucleic
acid probes forming a target array of nucleic acid probe/fragment
complexes. An array constitutes an ordered or structured plurality of nucleic
acids which may be fixed to a solid support or in liquid suspension.
Hybridization of the fragments to the array allows for sorting of very large
collections of nucleic acid fragments into identifiable groups. Sorting does
not require a priori knowledge of the sequences of the probes, and can
greatly facilitate analysis by, for example, mass spectrometric
techniques.

Hybridization between complementary bases of DNA, RNA,
PNA, or combinations of DNA, RNA and PNA, occurs under a wide variety
of conditions such as variations in temperature, salt concentration,
electrostatic strength, and buffer composition. Examples of these conditions
and methods for applying them are described in Nucleic Acid Hybridization:
A Practical Approach (B.D. Hames and S.J. Higgins, editors, IRL Press,
1985). It is preferred that hybridization takes place between about 0°C and
about 70°C, for periods of from about one minute to about one hour,
depending on the nature of the sequence to be hybridized and its length.
However, it is recognized that hybridizations can occur in seconds or hours,
depending on the conditions of the reaction. For example, typical
hybridization conditions for a mixture of two 20-mers are to bring the
mixture to 68°C and let it cool to room temperature (22°C) for five minutes,
or to hybridize at very low temperatures such as 1°C in 2 microliters.
Hybridization between nucleic acids may be facilitated using buffers such as
Tris-EDTA (TE), Tris-HCl and HEPES, salt solutions (e.g. NaCl, KCl, CaCl2),
other aqueous solutions, reagents and chemicals. Examples of these reagents
include single-stranded binding proteins such as Rec A protein, T4 gene 32
protein, E. coli single-stranded binding protein and major or minor nucleic
acid groove binding proteins. Examples of other reagents and chemicals
include divalent ions, polyvalent ions and intercalating substances such as
ethidium bromide, actinomycin D, psoralen and angelicin.

Optionally, hybridized target sequences may be ligated to a
single strand of the probes thereby creating ligated target-probe complexes
or ligated target arrays. Ligation of target nucleic acid to probe increases
fidelity of hybridization and allows for incorrectly hybridized target to be
easily washed from correctly hybridized target. More importantly, the
addition of a ligation step allows for hybridizations to be performed under
a single set of hybridization conditions. Variations of hybridization
conditions due to base composition are no longer relevant as nucleic acids
with high A/T or G/C content ligate with equal efficiency. Consequently,
discrimination is very high between matches and mismatches, much higher
than has been achieved using other methodologies wherein the effects of
G/C content were only somewhat neutralized in high concentrations of
quaternary or tertiary amines such as, for example, 3 M tetramethyl
ammonium chloride. Further, hybridization conditions such as temperatures
of between about 22°C to about 37°C, salt concentrations of between about
0.05 M to about 0.5 M, and hybridization times of between about less than
one hour to about 14 hours (overnight), are also suitable for ligation.
Ligation reactions can be accomplished using a eukaryotic derived or a
prokaryotic derived ligase such as T4 DNA or RNA ligase. Methods for use
of these and other nucleic acid modifying enzymes are described in Current
Protocols in Molecular Biology (F.M. Ausubel et al., editors, John Wiley &
Sons, 1989).

Each probe of the probe array comprises a single-stranded
portion, an optional double-stranded portion and a variable sequence within
the single-stranded portion. These probes may be DNA, RNA, PNA, or any
combination thereof, and may be derived from natural sources or
recombinant sources, or be organically synthesized. Preferably, each probe
has one or more double stranded portions which are about 4 to about 30
nucleotides in length, preferably about 5 to about 15 nucleotides and more
preferably about 7 to about 12 nucleotides, and which may also be identical
within the various probes of the array; one or more single stranded portions
which are about 4 to 20 nucleotides in length, preferably between about 5
to about 12 nucleotides and more preferably between about 6 to about 10
nucleotides; and a variable sequence within the single stranded portion
which is about 4 to 20 nucleotides in length and preferably about 4, 5, 6,
7 or 8 nucleotides in length. Overall probe sizes may range from as small
as 8 nucleotides in length to 100 nucleotides and above. Preferably, sizes
are from about 12 to about 35 nucleotides, and more preferably, from about
12 to about 25 nucleotides in length.

Probe sequences may be partly or entirely known,
determinable or completely unknown. Known sequences can be created, for
example, by chemically synthesizing individual probes with a specified
sequence at each region. Probes with determinable variable regions may be
chemically synthesized with random sequences and the sequence
information determined separately. Either or both the single-stranded and
the double-stranded regions may comprise constant sequences such as, for
example, when an area of the probe or hybridized nucleic acid would benefit
from having a constant sequence as a point of reference in subsequent
analyses.

An advantage of this type of probe is in its structure.
Hybridization of the target nucleic acid is encouraged due to the favorable
thermodynamic conditions, including base-stacking interactions, established
by the presence of the adjacent double strandedness of the probe. Probes
may be structured with terminal single-stranded regions which consist
entirely or partly of variable sequences, internal single-stranded regions
which contain both constant and variable regions, or combinations of these
structures. Preferably, the probe has a single-stranded region at one
terminus and a double-stranded region at the opposite terminus.

Fragmented target sequences, preferably, will have a
distribution of terminal sequences sufficiently broad so that the nucleotide
sequence of the hybridized fragments will include the entire sequence of the
target nucleic acid. Consequently, the typical probe array will comprise a
collection of probes with sufficient sequence diversity in the variable
regions to hybridize, with complete or nearly complete discrimination, all
of the target sequence or the target-derived sequences. The resulting target
array will comprise the entire target sequence on strands of hybridized
probes. By way of example only, if the variable portion consisted of a four
nucleotide sequence (R=4) of adenine, guanine, thymine, and cytosine, the
total number of possible combinations (4^R) would be 4^4 or 256 different
nucleic acid probes. If the number of nucleotides in the variable sequence
was five, the number of different probes within the set would be 4^5 or 1,024.
In addition, it is also possible to utilize probes wherein the variable
nucleotide sequence contains gapped segments, or positions along the
variable sequence which will base pair with any nucleotide or at least not
interfere with adjacent base pairing.
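The size of a complete probe set can be checked by enumerating every
possible variable sequence. The short sketch below is illustrative only;
the function name is hypothetical and the four-letter alphabet assumes
unmodified DNA bases.

```python
from itertools import product
from typing import List


def variable_sequences(length: int, alphabet: str = "ACGT") -> List[str]:
    """Enumerate every possible variable sequence of the given length."""
    return ["".join(bases) for bases in product(alphabet, repeat=length)]


if __name__ == "__main__":
    for r in (4, 5):
        # The count equals 4 raised to the power of the variable length.
        print(r, len(variable_sequences(r)))
```

The enumeration grows as 4^R, so each additional variable position
quadruples the number of probes that a complete array must carry.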

A nucleic acid strand of the target array may be extended or
elongated enzymatically. Either the hybridized fragment or one or the other
of the probe strands can be extended. Extension reactions can utilize various
regions of the target array as a template. For example, when fragment
sequences are longer than the hybridizable portion of a probe having a 3'
single-stranded terminus, the probe will have a 3' overhang and a 5'
overhang after hybridization of the fragment. The now internal 3' terminus
of the one strand of the probe can be used as a primer to prime an extension
reaction using, for example, an appropriate nucleic acid polymerase and
chain elongating nucleotides. The extended strand of the probe will contain
sequence information of the entire hybridized fragment. Reaction mixtures
containing dideoxynucleotides will create a set of extended strands of
varying lengths and, preferably, a nested set of strands. As the fragments
have been initially sorted by hybridization to the array, each probe of the
array will contain sets of nucleic acids that represent each segment of the
target sequence. Base sequence information can be determined from each
extended probe. Compilation of the sequence information from the array,
which may require computer assistance with very large arrays, will allow
one to determine the sequence of the target. Depending on the structure of
the probe (e.g. 5' overhang, 3' overhang, internal single-stranded region),
strands of the probe or strands of hybridized nucleic acid containing target
sequence can also be enzymatically amplified by, for example, single primer
PCR reactions. Variations of this process may involve aspects of strand
displacement amplification, Qβ replicase amplification, self-sustained
sequence replication amplification and any of the various polymerase chain
reaction amplification technologies.
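How base sequence information might be read from the molecular weights
of a nested set of extended strands can be sketched as follows. The
sketch assumes idealized, noise-free masses, unmodified DNA residues and
approximate average residue masses that are textbook values rather than
values from this description; the function names and the 0.5 dalton
tolerance are hypothetical.

```python
# Approximate average residue masses in daltons (assumed values).
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def ladder_masses(primer_mass, extension):
    """Masses of the primer and of each successively longer extended strand."""
    masses = [primer_mass]
    for base in extension:
        masses.append(masses[-1] + RESIDUE_MASS[base])
    return masses


def read_ladder(masses, tolerance=0.5):
    """Infer the extension sequence from the spacing between ladder masses.

    Each step between neighbouring masses is matched to the residue whose
    mass is closest; a step outside the tolerance raises ValueError.
    """
    ordered = sorted(masses)
    bases = []
    for lighter, heavier in zip(ordered, ordered[1:]):
        step = heavier - lighter
        base, expected = min(
            RESIDUE_MASS.items(), key=lambda item: abs(item[1] - step)
        )
        if abs(expected - step) > tolerance:
            raise ValueError("mass step %.2f matches no residue" % step)
        bases.append(base)
    return "".join(bases)
```

Because adenine and thymine residues differ by only about 9 daltons in
this table, the tolerance must stay well below that spacing, which is one
reason mass modifications that widen such differences are attractive.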



Extended nucleic acid strands of the probe can be mass
modified using a variety of techniques and methodologies. The most
straightforward may be to enzymatically synthesize the extension utilizing
a polymerase and nucleotide reagents, such as mass modified chain
elongating and chain terminating nucleotides. Mass modified nucleotides
incorporate into the growing nucleic acid chain. Mass modifications may
be introduced in most sites of the macromolecule which do not interfere
with the hydrogen bonds required for base pair formation during nucleic
acid hybridization. Typical modifications include modification of the
heterocyclic bases, modifications of the sugar moiety (ribose or
deoxyribose), and modifications of the phosphate group. Specifically, a
modifying functionality, which may be a chemical moiety, is placed at or
covalently coupled to the C2, N3, N7 or N8 positions of purines, or the N7
or N9 positions of deazapurines. Modifications may also be placed at the
C5 or C6 positions of pyrimidines (e.g. Figures 1A, 1B, 2A and 2B).
Examples of useful modifying groups include deuterium, F, Cl, Br, I, biotin,
fluorescein, iododicarbocyanine dye, SiR3, Si(CH3)3, Si(CH3)2(C2H5),
Si(CH3)(C2H5)2, Si(C2H5)3, (CH2)nCH3, (CH2)nNR2, CH2CONR2, (CH2)nOH,
CH2F, CHF2 and CF3; wherein n is an integer and R is selected from the
group consisting of -H, deuterium and alkyls, alkoxys and aryls of 1-6
carbon atoms, polyoxymethylene, monoalkylated polyoxymethylene,
polyethylene imine, polyamide, polyester, alkylated silyl,
hetero-oligo/polyaminoacid and polyethylene glycol (Figures 3 and 4).

Mass modifying functionalities may also be generated from
a precursor functionality such as -N3 or -XR, wherein X is: -OH, -NH2,
-NHR, -SH, -NCS, -OCO(CH2)nCOOH, -NHCO(CH2)nCOOH, -OSO2OH,
-OCO(CH2)nI or -OP(O-alkyl)N-(alkyl)2, and n is an integer from 1 to 20;
and R is: -H, deuterium and alkyls, alkoxys or aryls of 1-6 carbon atoms,
such as methyl, ethyl, propyl, isopropyl, t-butyl, hexyl, benzyl, benzhydryl,
trityl, substituted trityl, aryl, substituted aryl, polyoxymethylene,
monoalkylated polyoxymethylene, polyethylene imine, polyamide,
polyester, alkylated silyl, heterooligo/polyaminoacid or polyethylene glycol.
These and other mass modifying functionalities which do not interfere with
hybridization can be attached to nucleic acids either alone or in
combination. Preferably, combinations of different mass modifications are
utilized to maximize distinctions between nucleic acids having different
sequences.

Mass modifications may be major changes of molecular
weight, such as occurs with coupling between a nucleic acid and a
heterooligo/polyaminoacid, or more minor such as occurs by substituting
chemical moieties into the nucleic acid having molecular masses smaller
than the natural moiety. Non-essential chemical groups may be eliminated
or modified using, for example, an alkylating agent such as iodoacetamide.
Alkylation of nucleic acids with iodoacetamide has an additional advantage
that a reactive oxygen of the 3'-position of the sugar is eliminated. This
provides one less site per base for alkali cations, such as sodium, to interact.
Sodium, present in nearly all nucleic acids, increases the likelihood of
forming satellite adduct peaks upon ionization. Adduct peaks appear at a
slightly greater mass than the true molecule, which would greatly reduce the
accuracy of molecular weight determinations. These problems can be
addressed, in part, with matrix selection in mass spectrometric analysis, but
this only helps with nucleic acids of less than 20 nucleotides. Ammonium
(NH4+), which can substitute for the sodium cation (Na+) during ion
exchange, does not increase adduct formation. Consequently, another useful
mass modification is to remove alkali cations from the entire nucleic acid.
This can be accomplished by ion exchange with aqueous solutions of
ammonium such as ammonium acetate, ammonium carbonate, diammonium
hydrogen citrate, ammonium tartrate and combinations of these solutions.
Dissolving DNA in 3 M aqueous ammonium hydroxide neutralizes all the
acidic functions of the molecule. As there are no protons, there is a
significant reduction in fragmentation during procedures such as mass
spectrometry.
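The position of sodium adduct peaks relative to the true molecular peak
can be estimated with a short calculation. The sketch assumes that each
adduct arises from the replacement of one proton by one sodium cation and
uses standard atomic weights, which are not taken from this description;
the function name and the 5000 dalton parent mass are hypothetical.

```python
# Standard atomic weights in daltons (assumed reference values).
SODIUM = 22.98977
HYDROGEN = 1.00794

# Mass gained when one proton is exchanged for one sodium cation.
SODIUM_SHIFT = SODIUM - HYDROGEN


def adduct_series(parent_mass, max_sodium):
    """Return the parent mass followed by the masses of its sodium adducts.

    Entry n of the result corresponds to n protons replaced by sodium.
    """
    return [parent_mass + n * SODIUM_SHIFT for n in range(max_sodium + 1)]
```

For a hypothetical strand of 5000 daltons, `adduct_series(5000.0, 2)`
places satellite peaks about 22 and 44 daltons above the parent, which is
larger than the spacing between some single base differences and
illustrates why removal of alkali cations aids mass determination.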

Another mass modification is to utilize nucleic acids with
non-ionic polar phosphate backbones (e.g. PNA). Such nucleotides can be
generated by oligonucleoside phosphomonothioate diesters or by enzymatic
synthesis using nucleic acid polymerases and alpha- (α-) thio nucleoside
triphosphates and subsequent alkylation with iodoacetamide. Synthesis of
such compounds is straightforward and can be performed and the products
separated and isolated by, for example, analytical HPLC.

Mass modification of arrays can be performed before or after
target hybridization as the modifications do not interfere with
hybridization of nucleic acids or with hybridized nucleic acids. This
conditioning of the array is simple to perform and easily adaptable in bulk.
Probe arrays can therefore be synthesized with no special manipulations.
Only after the arrays are fixed to solid supports, just in fact when it
would be most convenient to perform mass modification, would probes be
conditioned.

Probe strands may also be mass modified subsequent to
synthesis by, for example, treating the extended strands with an alkylating
agent or a thiolating agent, or subjecting the nucleic acid to cation
exchange. Nucleic acids which can be modified include target sequences,
probe sequences and strands, extended strands of the probe and other
available fragments. Probes can be mass modified on either strand prior to
hybridization. Such arrays of mass modified or conditioned nucleic acids
can be bound to fragments containing the target sequence with no
interference to the fidelity of hybridization. Subsequent extension of either
strand of the probe, for example using Sanger sequencing techniques, and
using the target sequences as templates will create mass modified extended
strands. The molecular weights of these strands can be determined with
excellent accuracy.

Probes may be in solution, such as in wells or on the surface
of a micro-tray, or attached to a solid support. Mass modification can occur
while the probes are fixed to the support, prior to fixation or upon cleavage
from the support which can occur concurrently with ablation when analyzed
by mass spectrometry. In this regard, it can be important which strand is
released from the support upon laser ablation. Preferably, in such cases, the
probe is differentially attached to the support. One strand may be permanent
and the other temporarily attached or, at least, selectively releasable.

Examples of solid supports which can be used include a
plastic, a ceramic, a metal, a resin, a gel and a membrane. Useful types of
solid supports include plates, beads, microbeads, whiskers, combs,
hybridization chips, membranes, single crystals, ceramics and self-
assembling monolayers. A preferred embodiment comprises a two-
dimensional or three-dimensional matrix, such as a gel or hybridization chip
with multiple probe binding sites (Pevzner et al., J. Biomol. Struc. & Dyn.
9:399-410, 1991; Maskos and Southern, Nuc. Acids Res. 20:1679-84, 1992).
Hybridization chips can be used to construct very large probe arrays which
are subsequently hybridized with a target nucleic acid. Analysis of the
hybridization pattern of the chip can assist in the identification of the target
nucleotide sequence. Patterns can be manually or computer analyzed, but
it is clear that positional sequencing by hybridization lends itself to
computer analysis and automation. Algorithms and software have been
developed for sequence reconstruction which are applicable to the methods
described herein (R. Drmanac et al., J. Biomol. Struc. & Dyn. 5:1085-1102,
1991; P.A. Pevzner, J. Biomol. Struc. & Dyn. 7:63-73, 1989).
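The general idea of reconstructing a sequence from overlapping
hybridization results can be illustrated with a minimal sketch. This is
not the algorithm of the cited references; it is a simplified overlap
walk which assumes an error-free set of detected sequences, each of the
same length, in which every overlap is unique. The function names and the
example target are hypothetical.

```python
def spectrum(target, k):
    """Set of all length-k subsequences that an ideal array would detect."""
    return {target[i:i + k] for i in range(len(target) - k + 1)}


def reconstruct(kmers):
    """Rebuild a sequence by chaining k-mers that overlap by k-1 bases.

    Raises ValueError when the start is not unique or when an overlap is
    missing or ambiguous, since a unique reconstruction is then impossible
    with this simple walk.
    """
    kmers = set(kmers)
    if not kmers:
        raise ValueError("empty spectrum")
    k = len(next(iter(kmers)))
    by_prefix = {}
    for kmer in kmers:
        by_prefix.setdefault(kmer[:-1], []).append(kmer)
    suffixes = {kmer[1:] for kmer in kmers}
    # The starting k-mer is the only one whose prefix follows no other k-mer.
    starts = [kmer for kmer in kmers if kmer[:-1] not in suffixes]
    if len(starts) != 1:
        raise ValueError("spectrum has no unique starting k-mer")
    sequence = starts[0]
    used = {starts[0]}
    while len(used) < len(kmers):
        overlap = sequence[-(k - 1):]
        candidates = [c for c in by_prefix.get(overlap, []) if c not in used]
        if len(candidates) != 1:
            raise ValueError("ambiguous or broken overlap")
        sequence += candidates[0][-1]
        used.add(candidates[0])
    return sequence
```

With a hypothetical target `ATGGCGTGCA`, four-base sequences overlap
uniquely and the walk recovers the target, whereas three-base sequences
branch at `TG` and the sketch reports the ambiguity rather than guessing.
Longer variable regions therefore reduce branching at the cost of a
larger array.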

Nucleic acid probes may be attached to the solid support by
covalent binding such as by conjugation with a coupling agent, or by
covalent or non-covalent binding such as electrostatic interactions, hydrogen
bonds or antibody-antigen coupling, or by combinations thereof. Typical
coupling agents include biotin/avidin, biotin/streptavidin, Staphylococcus
aureus protein A/IgG antibody fragment and streptavidin/protein A
chimeras (T. Sano and C.R. Cantor, Bio/Technology 9:1378-81, 1991), or
derivatives or combinations of these agents. Nucleic acids may be attached
to the solid support by a photocleavable bond, an electrostatic bond, a
disulfide bond, a peptide bond, a diester bond or a combination of these sorts
of bonds. The array may also be attached to the solid support by a
selectively releasable bond such as 4,4'-dimethoxytrityl or its derivative.
Derivatives which have been found to be useful include 3 or 4 [bis-(4-
methoxyphenyl)]-methyl-benzoic acid, N-succinimidyl-3 or 4 [bis-(4-
methoxyphenyl)]-methyl-benzoic acid, N-succinimidyl-3 or 4 [bis-(4-
methoxyphenyl)]-hydroxymethyl-benzoic acid, N-succinimidyl-3 or 4 [bis-
(4-methoxyphenyl)]-chloromethyl-benzoic acid, and salts of these acids.

Binding may be reversible or permanent where strong 
associations would be critical. In addition, probes may be attached to solid 
supports via spacer moieties between the probes of the array and the solid 
support. Useful spacers include a coupling agent, as described above for
binding to other or additional coupling partners, or to render the attachment
to the solid support cleavable.

Cleavable attachments may be created by attaching cleavable
chemical moieties between the probes and the solid support such as an
oligopeptide, oligonucleotide, oligopolyamide, oligoacrylamide,
oligoethylene glycerol, alkyl chains of between about 6 to 20 carbon atoms,
and combinations thereof. These moieties may be cleaved with added
chemical agents, electromagnetic radiation or enzymes. Examples of
attachments cleavable by enzymes include peptide bonds which can be
cleaved by proteases and phosphodiester bonds which can be cleaved by
nucleases. Chemical agents such as β-mercaptoethanol, dithiothreitol (DTT)
and other reducing agents cleave disulfide bonds. Other agents which may
be useful include oxidizing agents, hydrating agents and other selectively
active compounds. Electromagnetic radiation such as ultraviolet, infrared
and visible light cleaves photocleavable bonds. Attachments may also be
reversible such as, for example, using heat or enzymatic treatment, or
reversible chemical or magnetic attachments. Release and reattachment can
be performed using, for example, magnetic or electrical fields.

Hybridized probes can provide direct or indirect information
about the hybridized sequence. Direct information may be obtained from
the binding pattern of the array wherein probe sequences are known or can
be determined. Indirect information requires additional analysis of a
plurality of nucleic acids of the target array. For example, a specific nucleic
acid sequence will have a unique or relatively unique molecular weight
depending on its size and composition. That molecular weight can be
determined, for example, by chromatography (e.g. HPLC), nuclear magnetic
resonance (NMR), high-definition gel electrophoresis, capillary
electrophoresis (e.g. HPCE), spectroscopy or mass spectrometry.
Preferably, molecular weights are determined by measuring the mass/charge
ratio with mass spectrometry technology.
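
By way of illustration only, the dependence of molecular weight on size and composition can be sketched as follows. The residue masses and the end-group correction are standard average values for unmodified single-stranded DNA with 5'- and 3'-hydroxyl ends; they are assumptions of this sketch and are not taken from the specification, and the function name `oligo_mass` is hypothetical.

```python
# Average masses (Da) of DNA nucleotide residues within a chain.
# Assumed textbook values, not values stated in the specification.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}

# Correction for an oligonucleotide with 5'-OH and 3'-OH ends
# (no terminal phosphate); also an assumed standard value.
END_CORRECTION = -61.96


def oligo_mass(seq: str) -> float:
    """Approximate average mass (Da) of an unmodified ssDNA oligonucleotide."""
    seq = seq.upper()
    return round(sum(RESIDUE_MASS[base] for base in seq) + END_CORRECTION, 2)


if __name__ == "__main__":
    # Same length, different composition: the masses differ.
    for s in ("ACGT", "AAGT", "ACGTA"):
        print(s, oligo_mass(s))
```

Two sequences of equal length but different composition give different masses, and extending a sequence by one base raises the mass by that base's residue mass.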



Mass spectrometry of biopolymers such as nucleic acids can
be performed using a variety of techniques (e.g. U.S. Patent Nos. 4,442,354;
4,931,639; 5,002,868; 5,130,538; 5,135,870; 5,174,962). Difficulties
associated with volatization of high molecular weight molecules such as
DNA and RNA have been overcome, at least in part, with advances in
techniques, procedures and electronic design. Further, only small quantities
of sample are needed for analysis, the typical sample being a mixture of 10
or so fragments. Quantities which range from between about 0.1 femtomole
to about 1.0 nanomole, preferably between about 1.0 femtomole to about
1000 femtomoles and more preferably between about 10 femtomoles to
about 100 femtomoles are typically sufficient for analysis. These amounts
can be easily placed onto the individual positions of a suitable surface or
attached to a support.

Another of the important features of this invention is that it is
unnecessary to volatize large lengths of nucleic acids to determine sequence
information. Using the methods of the invention, segments of the nucleic
acid target, discretely isolated into separate complexes on the target array,
can be sequenced and those sequence segments collated, making it
unnecessary to volatize the entire strand at once. Techniques which
can be used to volatize a nucleic acid fragment include fast atom
bombardment, plasma desorption, matrix-assisted laser
desorption/ionization, electrospray, photochemical release, electrical release,
droplet release, resonance ionization and combinations of these techniques.

In electrohydrodynamic ionization, thermospray, aerospray
and electrospray, the nucleic acid is dissolved in a solvent and injected with
the help of heat, air or electricity, directly into the ionization chamber. If the
method of ionization involves a light beam, particle beam or electric
discharge, the sample may be attached to a surface and introduced into the
ionization chamber. In such situations, a plurality of samples may be
attached to a single surface or multiple surfaces and introduced
simultaneously into the ionization chamber and still analyzed individually.
The appropriate sector of the surface which contains the desired nucleic acid
can be moved proximate to the path of an ionizing beam. After the beam is
pulsed on and the surface bound molecules are ionized, a different sector of
the surface is moved into the path of the beam and a second sample, with the
same or different molecule, is analyzed without reloading the machine.
Multiple samples may also be introduced at electrically isolated regions of
a surface. Different sectors of the chip are connected to an electrical source
and ionized individually. The surface to which the sample is attached may
be shaped for maximum efficiency of the ionization method used. For field
ionization and field desorption, a pin or sharp edge is an efficient solid
support and for particle bombardment and laser ionization, a flat surface.

The goal of ionization for mass spectroscopy is to produce a
whole molecule with a charge. Preferably, matrix-assisted laser
desorption/ionization (MALDI) or electrospray (ES) mass spectroscopy is
used to determine molecular weight and thus, sequence information from
the target array. It will be recognized by those of ordinary skill that a
variety of methods may be used which are appropriate for large molecules
such as nucleic acids. Typically, a nucleic acid is dissolved in a solvent and
injected into the ionization chamber using electrohydrodynamic ionization,
thermospray, aerospray or electrospray. Nucleic acids may also be attached
to a surface and ionized with a beam of particles or light. Particles which
have been successfully used include plasma (plasma desorption), ions (fast ion
bombardment) or atoms (fast atom bombardment). Ions have also been
produced with the rapid application of laser energy (laser desorption) and
electrical energy (field desorption).

In mass spectrometer analysis, the sample is ionized briefly by
a pulse of laser beams or by an electric field induced spray. The ions are
accelerated in an electric field and sent at a high velocity into the analyzer
portion of the spectrometer. The speed of the accelerated ion increases with
the charge (z) and decreases with the mass (m) of the ion; for a fixed
accelerating potential it varies as the square root of z/m. The mass of the
molecule may be deduced from the flight
characteristics of its ion. For small ions, the typical detector has a magnetic
field which functions to constrain the ion stream into a circular path. The
radii of the paths of equally charged particles in a uniform magnetic field are
directly proportional to mass. That is, a heavier particle with the same
charge as a lighter particle will have a larger flight radius in a magnetic
field. It is generally considered to be impractical to measure the flight
characteristics of large ions such as nucleic acids in a magnetic field because
the relatively high mass to charge (m/z) ratio requires a magnet of unusual
size or strength. To overcome this limitation the electrospray method, for
example, can consistently place multiple ions on a molecule. Multiple
charges on a nucleic acid will decrease the mass to charge ratio allowing a
conventional quadrupole analyzer to detect species of up to 100,000 daltons.
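
As a hedged numerical sketch of how multiple charging lowers the mass to charge ratio, the following assumes negative-ion electrospray in which each charge arises from loss of one proton. That ionization mode, the 2,000 m/z analyzer limit used in the example and the function names are assumptions of this illustration and are not stated in the specification.

```python
PROTON_MASS = 1.00728  # Da, mass of a proton


def mz_negative(mass_da: float, z: int) -> float:
    """m/z of an ion carrying z negative charges, each from loss of one proton."""
    if z < 1:
        raise ValueError("charge state must be at least 1")
    return (mass_da - z * PROTON_MASS) / z


def min_charge_for_limit(mass_da: float, mz_limit: float) -> int:
    """Smallest charge state whose m/z falls at or below a given analyzer limit."""
    z = 1
    while mz_negative(mass_da, z) > mz_limit:
        z += 1
    return z


if __name__ == "__main__":
    # A 100,000 Da species against a hypothetical analyzer limit of m/z 2,000.
    print(min_charge_for_limit(100_000, 2_000))
```

Under these assumptions a 100,000 dalton species needs roughly fifty charges before its m/z falls within the hypothetical limit.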

Nucleic acid ions generated by matrix assisted laser
desorption/ionization only have a unit charge and, because of their large
mass, generally require analysis by a time of flight (TOF) analyzer. Time of
flight analyzers are basically long tubes with a detector at one end. In the
operation of a TOF analyzer, a sample is ionized briefly and accelerated
down the tube. After detection, the time needed for travel down the detector
tube is calculated. The mass of the ion may be calculated from the time of
flight. TOF analyzers do not require a magnetic field and can detect unit
charged ions with a mass of up to 100,000 daltons. For improved resolution,
the time of flight mass spectrometer may include a reflectron, a region at the
end of the flight tube which negatively accelerates ions. Moving particles
entering the reflectron region, which contains a field of opposite polarity to
the accelerating field, are retarded to zero speed and then reverse accelerated
out with the same speed but in the opposite direction. In the use of an
analyzer with a reflectron, the detector is placed on the same side of the
flight tube as the ion source to detect the returned ions, and the effective
length of the flight tube and the resolving power are effectively doubled.
The calculation of mass to charge ratio from the time of flight data takes into
account the time spent in the reflectron.
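
The relationship between flight time and mass may be sketched for an idealized linear TOF analyzer as follows. The sketch assumes that each ion acquires the full energy zeV before entering a field-free tube and ignores the time spent in the accelerating region and in any reflectron; the accelerating voltage, tube length and function names are illustrative assumptions only.

```python
import math

E_CHARGE = 1.602176634e-19    # elementary charge, coulombs
DALTON = 1.66053906660e-27    # one dalton in kilograms


def flight_time(mass_da: float, z: int, accel_volts: float, tube_m: float) -> float:
    """Drift time (s) through a field-free tube for an ion of energy z*e*V."""
    speed = math.sqrt(2 * z * E_CHARGE * accel_volts / (mass_da * DALTON))
    return tube_m / speed


def mass_from_time(t_s: float, z: int, accel_volts: float, tube_m: float) -> float:
    """Invert the drift-time relation: m = 2*z*e*V*t^2 / L^2, returned in Da."""
    return 2 * z * E_CHARGE * accel_volts * t_s ** 2 / (tube_m ** 2 * DALTON)


if __name__ == "__main__":
    # Hypothetical settings: 20 kV acceleration, 1 m flight tube, unit charge.
    for mass in (3_000, 30_000):
        t = flight_time(mass, 1, 20_000, 1.0)
        print(mass, f"{t * 1e6:.1f} microseconds")
```

In this idealized model the flight time grows with the square root of the mass, so a tenfold heavier ion arrives about 3.16 times later.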
Ions with the same charge to mass ratio will typically leave the
ion accelerators with a range of energies because the ionization region of
a mass spectrometer is not a point source. Ions generated further away from
the flight tube spend a longer time in the accelerator field and enter the
flight tube at a higher speed. Thus ions of a single species of molecule will
arrive at the detector at different times. In time of flight analysis, a longer
time in the flight tube in theory provides more sensitivity, but due to the
different speeds of the ions, the noise (background) will also be increased.
A reflectron, besides effectively doubling the effective length of the flight
tube, can reduce the error and increase sensitivity by reducing the spread of
detector impingement time of a single species of ions. An ion with a higher
velocity will enter the reflectron at a higher velocity and stay in the
reflectron region longer than a lower velocity ion. If the reflectron electrode
voltages are arranged appropriately, the peak width contribution from the
initial velocity distribution can be largely corrected for at the plane of the
detector. The correction provided by the reflectron leads to increased mass
resolution for all stable ions, those which do not dissociate in flight, in the
spectrum.

While a linear field reflectron functions adequately to reduce
noise and enhance sensitivity, reflectrons with more complex field strengths
offer superior correctional abilities and a number of complex reflectrons can
be used. The double stage reflectron has a first region with a weaker electric
field and a second region with a stronger electric field. The quadratic and
the curve field reflectron have an electric field which increases as a function
of the distance. These functions, as their names imply, may be a quadratic
or a complex exponential function. The dual stage, quadratic, and curve
field reflectrons, while more elaborate, are also more accurate than the linear
reflectron.

The detection of ions in a mass spectrometer is typically
performed using electron detectors. To be detected, the high mass ions
produced by the mass spectrometer are converted into either electrons or low
mass ions at a conversion electrode. These electrons or low mass ions are
then used to start the electron multiplication cascade in an electron
multiplier and further amplified with a fast linear amplifier. The signals
from multiple analyses of a single sample are combined to improve the
signal to noise ratio and the peak shapes, which also increases the accuracy
of the mass determination.

This invention is also directed to the detection of multiple
primary ions directly through the use of ion cyclotron resonance and Fourier
analysis. This is useful for the analysis of a complete sequencing ladder
immobilized on a surface. In this method, a plurality of samples are ionized
at once and the ions are captured in a cell with a high magnetic field. An RF
field excites the population of ions into cyclotron orbits. Because the
frequencies of the orbits are a function of mass, an output signal
representing the spectrum of the ion masses is obtained. This output is
analyzed by a computer using Fourier analysis which reduces the combined
signal to its component frequencies and thus provides a measurement of the
ion masses present in the ion sample. Ion cyclotron resonance and Fourier
analysis can determine the masses of all nucleic acids in a sample. The
application of this method is especially useful on a sequencing ladder.
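
A minimal sketch of the two ideas in this paragraph follows: the standard cyclotron relation f = zeB/(2πm), which ties orbital frequency to mass, and a naive discrete Fourier transform that separates a combined signal into its component frequencies. The synthetic two-component signal, the field strength and the function names are assumptions chosen for illustration; a practical instrument would use an optimized FFT and calibrated frequencies.

```python
import cmath
import math

E_CHARGE = 1.602176634e-19    # elementary charge, coulombs
DALTON = 1.66053906660e-27    # one dalton in kilograms


def cyclotron_frequency_hz(mass_da: float, z: int, b_tesla: float) -> float:
    """Cyclotron frequency f = z*e*B / (2*pi*m) for an ion in a uniform field."""
    return z * E_CHARGE * b_tesla / (2 * math.pi * mass_da * DALTON)


def mass_from_frequency(freq_hz: float, z: int, b_tesla: float) -> float:
    """Invert the cyclotron relation to recover the ion mass in Da."""
    return z * E_CHARGE * b_tesla / (2 * math.pi * freq_hz * DALTON)


def dft_magnitudes(signal):
    """Magnitudes of a naive discrete Fourier transform (bins 0 .. n/2 - 1)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal)))
        for k in range(n // 2)
    ]


def dominant_bins(signal, count):
    """Indices of the `count` strongest frequency bins, in ascending order."""
    mags = dft_magnitudes(signal)
    ranked = sorted(range(len(mags)), key=lambda k: mags[k], reverse=True)
    return sorted(ranked[:count])


if __name__ == "__main__":
    n = 64
    # Synthetic combined signal from two ion populations (bins 5 and 12).
    signal = [math.sin(2 * math.pi * 5 * i / n) + 0.5 * math.sin(2 * math.pi * 12 * i / n)
              for i in range(n)]
    print(dominant_bins(signal, 2))
    print(cyclotron_frequency_hz(3000, 1, 7.0))
```

Each recovered frequency component can then be converted to a mass with `mass_from_frequency`, so a mixture of ions is resolved from a single recorded signal.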

The data from mass spectrometry, either performed singly or
in parallel (multiplexed), can determine the molecular mass of a nucleic acid
sample. The molecular mass, combined with the known sequence of the
sample, can be analyzed to determine the length of the sample. Because
different bases have different molecular weights, the output of a high
resolution mass spectrometer, combined with the known sequence and
reaction history of the sample, will determine the sequence and length of the
nucleic acid analyzed. In the mass spectroscopy of a sequencing ladder,
generally the base sequence of the primers is known. From a known
sequence of a certain length, the added base of a sequence one base longer
can be deduced by a comparison of the mass of the two molecules. This
process is continued until the complete sequence of a sequencing ladder is
determined.
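
The stepwise comparison described above can be sketched as follows. The sketch assumes idealized, noise-free masses, unmodified DNA residues with standard average masses (values not taken from the specification), and that every ladder member carries the same end group so that constant offsets cancel in the differences. The function names and the 1 Da tolerance are hypothetical.

```python
# Average residue masses (Da) of unmodified DNA nucleotides; assumed
# textbook values, not values stated in the specification.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def call_ladder(masses, tolerance=1.0):
    """Read a sequence from the mass differences of successive ladder members.

    `masses` holds one mass per ladder member, each one base longer than
    the previous; the shortest member serves as the known reference.
    """
    ordered = sorted(masses)
    bases = []
    for shorter, longer in zip(ordered, ordered[1:]):
        delta = longer - shorter
        # Pick the base whose residue mass lies closest to the observed step.
        base, ref = min(RESIDUE_MASS.items(), key=lambda kv: abs(kv[1] - delta))
        if abs(ref - delta) > tolerance:
            raise ValueError(f"mass step {delta:.2f} Da matches no single base")
        bases.append(base)
    return "".join(bases)


def make_ladder(reference_mass, extension):
    """Build idealized ladder masses for a reference fragment plus an extension."""
    masses = [reference_mass]
    for base in extension:
        masses.append(masses[-1] + RESIDUE_MASS[base.upper()])
    return masses


if __name__ == "__main__":
    ladder = make_ladder(3000.0, "GATTACA")
    print(call_ladder(ladder))
```

The smallest difference between two residue masses in this table is about 9 Da (T versus A), which indicates the mass accuracy such a comparison would require.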

Another embodiment of the invention is directed to a method
for detecting a target nucleic acid. As before, a set of nucleic acids
complementary or homologous to a sequence of the target is hybridized to
an array of nucleic acid probes. The molecular weights of the hybridized
nucleic acids are determined by, for example, mass spectrometry and the
nucleic acid target is detected by the presence of its sequence in the sample.
As the object is not to obtain extensive sequence information, probe arrays may be
fairly small with the critical sequences, the sequences to be detected,
repeated in as many variations as possible. Variations may have greater than
95% homology to the sequence of interest, greater than 80%, greater than
70% or greater than about 60%. Variations may also have additional
sequences not required or present in the target sequence to increase or
decrease the degree of hybridization. Sensitivity of the array to the target
sequence is increased while reducing and hopefully eliminating the number
of false positives.

Target nucleic acids to be detected may be obtained from a
biological sample, an archival sample, an environmental sample or another
source expected to contain the target sequence. For example, samples may
be obtained from biopsies of a patient and the presence of the target
sequence is indicative of the disease or disorder such as, for example, a
neoplasm or an infection. Samples may also be obtained from
environmental sources such as bodies of water, soil or waste sites to detect
the presence and possibly identify organisms and microorganisms which may
be present in the sample. The presence of particular microorganisms in the
sample may be indicative of a dangerous pathogen or that the normal flora
is present.

Another embodiment of the invention is directed to the arrays
of nucleic acid probes useful in the above-described methods and
procedures. These probes comprise a first strand and a second strand
wherein the first strand is hybridized to the second strand forming a double-
stranded portion, a single-stranded portion and a variable sequence within
the single-stranded portion. The array may be attached to a solid support
such as a material that facilitates volatization of nucleic acids for mass
spectrometry. Typically, arrays comprise large numbers of probes such as
less than or equal to about 4^R different probes, where R is the length in
nucleotides of the variable sequence. When utilizing arrays for large scale
sequencing, larger arrays can be used whereas arrays which are used for
detection of specific sequences may be fairly small as many of the potential
sequence combinations will not be necessary.
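
The 4^R upper bound follows from the four possible bases at each of the R variable positions. A short sketch, with hypothetical function names, enumerates the variable sequences of a complete array:

```python
from itertools import product


def array_size(r: int) -> int:
    """Number of distinct variable sequences of length r over A, C, G, T."""
    return 4 ** r


def variable_sequences(r: int):
    """Enumerate every variable sequence of length r (a complete array)."""
    return ["".join(p) for p in product("ACGT", repeat=r)]


if __name__ == "__main__":
    for r in (4, 5, 6):
        print(r, array_size(r))
    print(variable_sequences(2)[:5])
```

A five-base variable region therefore corresponds to at most 1024 probes and a four-base region to 256, while an array intended only for detection would contain a chosen subset of these.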

Arrays may also comprise nucleic acid probes which are
entirely single-stranded and nucleic acids which are single-stranded, but
possess hairpin loops which create double-stranded regions. Such structures
can function in a manner similar if not identical to the partially single-
stranded probes, which comprise two strands of nucleic acid, and have the
additional advantage of thermodynamic energy available in the secondary
structure.



Arrays may be in solution or fixed on a solid support through
streptavidin-biotin interactions or other suitable coupling agents. Arrays
may also be reversibly fixed to the solid support using, for example,
chemical moieties which can be cleaved with electromagnetic radiation,
chemical agents and the like. The solid support may comprise materials
such as matrix chemicals which assist in the volatization process for mass
spectrometric analysis. Such chemicals include nicotinic acid, 3-
hydroxypicolinic acid, 2,5-dihydroxybenzoic acid, sinapinic acid, succinic
acid, glycerol, urea and Tris-HCl, pH about 7.3.

Another embodiment of the invention is directed to
sequencing double-stranded nucleic acids using strand-displacement
polymerization. With this method it is unnecessary to denature the double-
strands to obtain sequence information. Strand-displacement polymerization
creates a new strand while simultaneously displacing the existing strand.
Techniques for incorporating label into the growing strand are well known
and the newly polymerized strand is easily detected by, for example, mass
spectrometry.



Target nucleic acid or nucleic acids containing sequences that
correspond to the sequence of the target are digested, for example, with
restriction enzymes, in one or more steps to create a set of fragments which
are partially single-stranded and partially double-stranded. Another set of
nucleic acids, the probes, are also partially single-stranded and partially
double-stranded. These probes preferably contain variable or constant
regions within the single-stranded portion of the terminus of each fragment
(5'- or 3'-overhangs). Probes or fragments are treated with a phosphatase to
remove phosphate groups from the 5'-termini of the nucleic acids.
Phosphatase treatment prevents nucleic acid ligation by ligase, which
requires a terminal 5'-phosphate to covalently link to a 3'-hydroxyl. Single-
stranded regions of the fragments are hybridized to single-stranded regions
of the probes forming an array of hybridized target/probe complexes.
Adjacent or abutting nucleic acid strands of the complex are ligated,
covalently joining a strand of the fragment to a strand of the probe.
Phosphatase treatment prevents both self-ligation of phosphatase-treated
nucleic acids and ligation between the 5'-termini of phosphatased nucleic
acids and the 3'-termini of untreated nucleic acids. These complexes are
treated with a nucleic acid polymerase that recognizes and binds to the nick
in the unligated strand to initiate polymerization. The polymerase
synthesizes a new strand using the ligated strand as a template, while
displacing the complementary strand. The reaction may be supplemented
with labeled or mass modified nucleotides (e.g. mass modifications at
positions C2, N3, N7 or C8 of purine, or at N7 or N9 of deazapurine) or
other detectable markers that will allow for the detection of new synthesis.
Either the probes or the fragments may be fixed to a solid support such as
a plastic or glass surface, membrane or structure (magnetic bead) which
eliminates the need for repetitive extractions or other purification of nucleic
acids between steps.

Preferably, double-stranded nucleic acids containing target
sequences are obtained by polymerase chain reaction or enzymatic digestion
(e.g. restriction enzymes) of the target sequence. Target sequences may be
DNA, RNA, RNA/DNA hybrids, cDNA, PNA or modifications or
combinations thereof and are preferably from about 10 to about 1,000
nucleotides in length, more preferably, from about 20 to about 500
nucleotides in length, and even more preferably, from about 35 to about 250
nucleotides in length. 5'-termini of the nucleic acid fragments or probes may
be dephosphorylated with a phosphatase, such as alkaline or calf intestinal
phosphatase, which eliminates the action of a nucleic acid ligase. Upon
hybridization of fragment to probe, only one of the two internal 5'-3'
junctions contains a 5'-phosphate and is capable of ligation. The second
junction appears as a nick in a strand of the complex. Nucleic acid
polymerases, such as Klenow, recognize the nick and synthesize a new
strand while displacing the complementary, ligated strand. Chain elongation
can proceed in the presence of, for example, nucleotide triphosphates and
chain terminating nucleotides. Nucleic acid synthesis terminates when a
dideoxynucleotide is incorporated into the elongating strand. The resulting
fragments represent a nested set of the sequence of the target. Precursor
nucleotides may be labeled with, for example, mass modifications. The
mass modified fragments can be easily analyzed by mass spectrometry to
determine the sequence of the target. Complexes may further comprise
single-stranded binding protein (SSB; E. coli) which increases stability of
the complex and facilitates polymerase action. Bands otherwise obscured are
more easily detected. SSB can be used to sequence fragments of greater
than 100 nucleotides, preferably greater than 150 nucleotides and more
preferably greater than 200 nucleotides.
This method is generally useful for manual or automated
nucleic acid sequencing, and especially useful for identifying and
sequencing a single or group of nucleic acid species in a mixed background
containing a plurality of species of different sequences. In this method,
selection is performed upon hybridization and ligation of fragments to
probes. Probes may be designed to contain a common or variable sequence
within the single-stranded region that is complementary to a sequence of the
fragment to be identified and, if desired, sequenced. Stringency of
fragment/probe hybridization can be adjusted by methods well-known to
those of ordinary skill to match desired conditions of selection. For
example, the single-stranded region of the probe can be designed to contain
a specific sequence only found on the single-stranded region of the nucleic
acid fragment of interest. Alternatively, multiple probes containing multiple
variable regions may be used to select for those fragment sequences which
may be longer than the length of the single-stranded region of any one
probe. Hybridization and ligation selects the specific fragment from a
complex mixture of different fragments and only that specific fragment is
subsequently sequenced.

Probes are typically from about 15 to about 200 nucleotides
in length, but can be larger or smaller depending on the particular application.
Single-stranded regions of the probes may be about 3, 4, 5, 6, 7, 8, 9, 10, 12,
15, 20, 22, 25 or 30 nucleotides in length or larger. For probes containing
a variable region within the single-stranded region, the length of this
variable region may be the same or smaller than the length of the entire
single-stranded portion. Variable regions may be distinct between probes
or common within sets of probes. The double-stranded region of the probe
is typically larger than the single-stranded region and may be about 4, 5, 6,
7, 8, 9, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 35, 40 or 50 nucleotides in
length or larger. Probes may also be modified to facilitate attachment to a
solid support or other surfaces, or modified to be individually detectable for
identification or other purposes. Sets of nucleic acids, either fragments or
probes, preferably contain greater than 10^2, 10^3, 10^4, 10^5, 10^6, 10^7, 10^8, 10^9
or 10^10 different members.

Another embodiment of the invention is directed to kits for
detecting a sequence of a target nucleic acid. An array of nucleic acid
probes is fixed to a solid support which may be coated with a matrix
chemical that facilitates volatization of nucleic acids for mass spectrometry.
Kits can be used to detect diseases and disorders in biological samples by
detecting specific nucleic acid sequences which are indicative of the
disorder. Probes may be labeled with detectable labels which only become
detectable upon hybridization with a correctly matched target sequence.
Detectable labels include radioisotopes, metals, luminescent or
bioluminescent chemicals, fluorescent chemicals, enzymes and
combinations thereof.

Another embodiment of the invention is directed to nucleic
acid sequencing systems which comprise a mass spectrometer, a computer
loaded with appropriate software for analysis of nucleic acids and an array
of probes which can be used to capture a target nucleic acid sequence.
Systems may be manual or automated as desired.

The following experiments are offered to illustrate
embodiments of the invention, and should not be viewed as limiting the
scope of the invention.

Examples

Example 1 Preparation of Target Nucleic Acid.

Target nucleic acid is prepared by restriction endonuclease
cleavage of cosmid DNA. The properties of type II and other restriction
nucleases that cleave outside of their recognition sequences were exploited.
A restriction digestion of a 10 to 50 kb DNA sample with such an enzyme
produced a mixture of DNA fragments most of which have unique ends.
Recognition and cleavage sites of useful enzymes are shown in Table 1.

Table 1

Restriction Enzymes and Recognition Sites for PSBH

Mwo I      GCNNNNN-NNGC
           CGNN-NNNNNCG

BsiY I     CCNNNNN-NNGG
           GGNN-NNNNNCC

ApaB I     GCANNNNN-TGC
           CGT-NNNNNACG

Mnl I      CCTCN7
           GGAGN6

TspR I     NNCAGTGNN
           NNGTCACNN

Cje I      CCANNNNNN-GTNNNN
           GGTNNNNNN-CANNNN

CjeP I     CCANNNNN-NNTCNN
           GGTNNNNN-NNAGNN


One restriction enzyme, ApaB I, with a 6 base pair
recognition site may also be used. DNA sequencing is best served by
enzymes that produce average fragment lengths comparable to the lengths
of DNA sequencing ladders analyzable by mass spectrometry. At present
these lengths are about 100 bases or less.
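
As an illustrative sketch, the Mwo I entry of Table 1 (GC, seven unspecified bases, GC) can be expressed as a pattern and used to locate candidate sites in a sequence. Reading the hyphens in Table 1 as cleavage positions, so that top-strand bases 5 to 7 of the site form a three-base 3' overhang, is an interpretation made for this sketch, and the function name is hypothetical.

```python
import re

# GC, seven unspecified bases, GC. The lookahead lets overlapping sites match.
MWOI_SITE = re.compile(r"(?=(GC[ACGT]{7}GC))")


def mwoi_sites(seq: str):
    """Return (start position, three-base overhang) for each Mwo I site.

    The overhang is taken as top-strand bases 5-7 of the 11-base site,
    which assumes the hyphens in Table 1 mark the cleavage positions.
    """
    seq = seq.upper()
    hits = []
    for match in MWOI_SITE.finditer(seq):
        site = match.group(1)
        hits.append((match.start(), site[4:7]))
    return hits


if __name__ == "__main__":
    print(mwoi_sites("TTGCAACTGAAGCTT"))
```

Because three bases are unspecified in the overhang, sites found in this way fall into 4^3 = 64 overhang classes.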



BsiY I and Mwo I restriction endonucleases are used together
to digest DNA in preparation of PSBH. Target DNA is cleaved to
completion and complexed with PSBH probes either before or after melting.
The fraction of fragments with unique ends or degenerate ends depends on
the complexity of the target sequence. For example, a 10 kilobase clone
would yield on average 16 fragments or a total of 32 ends since each double-
stranded DNA target produces two ligatable 3' ends. With 1024 possible
ends, Poisson statistics (Table 2) predict that there would be 3%
degeneracies. In contrast, a 40 kilobase cosmid insert would yield 64
fragments or 128 ends, of which 12% would be degenerate, and a
50 kilobase sample would yield 80 fragments or 160 ends. Some of these
would surely be degenerate. Up to at least 100 kilobases, the larger the target
the more sequences are available from each multiplex DNA sample
preparation. With a 100 kilobase target, 27% of the targets would be
degenerate.

Table 2

Poisson Distribution of Restriction Enzyme Sites

Target size         Mwo I                     TspR I
(kb)          Sequencing  Assembly      Sequencing  Assembly

 10              0.97       0.60           0.94       0.94
 40              0.88       0.14           0.80       0.80
100              0.73       0.01           0.57       0.57
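
The entries of Table 2 appear consistent with a simple Poisson approximation in which the fraction of ends that remain unique is exp(-n/N), where n is the number of ends and N the number of possible end classes. The following sketch is a reading of the table under that assumption and is not a method stated in the specification. It takes 32 ends per 10 kilobases for Mwo I and BsiY I, an average TspR I fragment of 1370 bases, 1024 and 256 end classes, and the 64 three-base categories described in the text for Mwo I assembly; the function names are hypothetical.

```python
import math


def unique_fraction(n_ends: float, n_classes: int) -> float:
    """Poisson estimate of the fraction of ends not shared with another end."""
    return math.exp(-n_ends / n_classes)


def table2_row(size_kb: float):
    """Estimated (Mwo I seq., Mwo I assembly, TspR I seq., TspR I assembly)."""
    mwo_ends = 3.2 * size_kb               # 16 fragments (32 ends) per 10 kb
    tsp_ends = 2 * size_kb * 1000 / 1370   # two ends per 1370-base fragment
    return (
        unique_fraction(mwo_ends, 1024),   # five-base ends: 4^5 classes
        unique_fraction(mwo_ends, 64),     # three-base overlap: 4^3 classes
        unique_fraction(tsp_ends, 256),    # four unspecified bases: 4^4 classes
        unique_fraction(tsp_ends, 256),
    )


if __name__ == "__main__":
    for size in (10, 40, 100):
        print(size, [f"{value:.2f}" for value in table2_row(size)])
```

Under these assumptions the estimates agree with the tabulated values to within about 0.01; the small differences are presumably due to rounding in the original table.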



With BsiY I and Mwo I, any restriction site that yields a unique
5 base end may be captured twice and the resulting sequence data obtained
will read away from the site in both directions (Figure 5). With the
knowledge of three bases of overlapping sequence at the site, this sorts all
sequences into 64 different categories. With 10 kilobase targets, 60% will
contain fragments and, thus, sequence assembly is automatic.

Two array capture methods can be used with Mwo I and BsiY
I. In the first method, conventional five base capture is used. Because the
two target bases adjacent to the capture site are known, since they form part
of the restriction enzyme recognition sequence, an alternative capture strategy
would build the complement of these two bases into the capture sequence.
Seven base capture is thermodynamically more stable, but less
discriminating against mismatches.

TspR I is another commercially available restriction enzyme
with properties that are very attractive for use in PSBH-mediated Sanger
sequencing. The method for using TspR I is shown in Figure 6. TspR I has
a five base recognition site and cuts two bases outside this site on each
strand to yield nine base 3' single-stranded overhangs. These can be
captured with partially duplex probes with complementary nine base
overhangs. Because only four bases are not specified by enzyme
recognition, TspR I digestion results in only 256 types of cleavage sites. With
human DNA the average fragment length that results is 1370 bases. This
enzyme is ideal for generating long sequence ladders, which are useful as
input to long thin gel sequencing where reads up to a kilobase are common. A
typical human cosmid yields about 30 TspR I fragments or 60 ends. Given
the length distribution expected, many of these could not be sequenced fully
from one end. With 256 possible overhangs, Poisson statistics (Table 2)
indicate that 80% of adjacent fragments can be assembled with no additional
labor. Thus, very long blocks of continuous DNA sequence are produced.

Three additional restriction enzymes are also useful. These
are Mnl I, Cje I and CjeP I (Table 1). The first has a four base site with one
A+T and should give smaller human DNA fragments on average than Mwo I or
BsiY I. The latter two have unusual interrupted five base recognition sites
and might supplement TspR I.

Target DNA may also be prepared by tagged PCR. It is
possible to add a preselected five base 3' terminal sequence to a target DNA
using a PCR primer five bases longer than the known target sequence
priming site. Samples made in this way can be captured and sequenced
using the PSBH approach based on the five base tag. A biotin was used to
allow purification of the complementary strand prior to use as an
immobilized sequencing template. A biotin may also be placed on the tag.
After capture of the duplex PCR product by streptavidin-coated magnetic
microbeads, the desired strand (needed to serve as a sequencing template)
could be denatured from the duplex and used to contact the entire probe
array. For multiplex sample preparation, a series of different five base
tagged primers would be employed, ideally in a single multiplex PCR
reaction. This approach also requires knowing enough target sequence for
unique PCR amplification and is more useful for shotgun sequencing or
comparative sequencing than for de novo sequencing.
Example 2 Basic Aspects of Positional Sequencing by Hybridization.

An examination of the potential advantages of stacking
hybridization has been carried out by both calculations and pilot
experiments. Some calculated Tm's for perfect and mismatched duplexes are
shown in Figure 7. These are based on average base compositions. The
calculations revealed that the binding of a second oligomer next to a
pre-formed duplex provides an extra stability equal to about two base pairs
and that mis-pairing seems to have a larger consequence on stacking
hybridization than it does on ordinary hybridization. Other types of
mis-pairing are less destabilizing, but these can be eliminated by requiring
a ligation step. In standard SBH, a terminal mismatch is the least
destabilizing event, and leads to the greatest source of ambiguity or
background. For an octanucleotide complex, an average terminal mismatch
leads to a 6 °C lowering in Tm. For stacking hybridization, a terminal
mismatch on the side away from the pre-existing duplex is the least
destabilizing event. For a pentamer, this leads to a drop in Tm of 10 °C.
These considerations indicate that the discrimination power of stacking
hybridization in favor of perfect duplexes is greater than that of ordinary
SBH.
Example 3 Preparation of Model Arrays.

In a single synthesis, all 1024 possible single-stranded probes
with a constant 18 base stalk followed by a variable 5 base extension can be
created. The 18 base extension is designed to contain two restriction
enzyme cutting sites. Hga I generates a 5 base, 5' overhang consisting of the
variable bases N5. Not I generates a 4 base, 5' overhang at the constant end
of the oligonucleotide. The synthetic 23-mer mixture hybridized with a
complementary 18-mer forms a duplex which can be enzymatically
extended to form all 1024 23-mer duplexes. These are cloned by, for
example, blunt end ligation, into a plasmid which lacks Not I sites. Colonies
containing the cloned 23-base insert are selected and each clone contains
one unique sequence. DNA minipreps can be cut at the constant end of the
stalk, filled in with biotinylated pyrimidines and cut at the variable end of
the stalk to generate the 5 base 5' overhang. The resulting nucleic acid is
fractionated by Qiagen columns (nucleic acid purification columns) to
discard the high molecular weight material. The nucleic acid probe will then
be attached to a streptavidin-coated surface. This procedure could easily be
automated in a Beckman Biomec or equivalent chemical robot to produce
many identical arrays of probes.
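
As a minimal sketch of what "all 1024 possible probes" means, the Python fragment below enumerates every 5 base variable extension on a fixed 18 base stalk. The stalk shown is only a placeholder borrowed from the constant region in Table 3; the stalk of this Example, which carries Hga I and Not I sites, is not given in the text, and the function name is hypothetical.

```python
from itertools import product

# Placeholder 18-base constant region (taken from Table 3 for illustration;
# not the Hga I / Not I stalk described in this Example).
STALK = "GATGATCCGACGCATCAG"

def all_probes(stalk: str = STALK, overhang_len: int = 5) -> list[str]:
    """Return every stalk + variable-extension sequence, written 5'->3'."""
    return [stalk + "".join(bases) for bases in product("ACGT", repeat=overhang_len)]

probes = all_probes()
print(len(probes), "probes of length", len(probes[0]))
```

With a 6 base extension the same enumeration gives 4096 sequences, which is the upper end of the array sizes discussed later for master arrays.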

The initial array contains about a thousand probes. The
particular sequence at any location in the array will not be known.
However, the array can be used for statistical evaluation of the signal to
noise ratio and the sequence discrimination for different target molecules
under different hybridization conditions. Hybridization with known nucleic
acid sequences allows for the identification of particular elements of the
array. A sufficient set of hybridizations would train the array for any
subsequent sequencing task. Arrays are partially characterized until they
have the desired properties. For example, the length of the oligonucleotide
duplex, the mode of its attachment to a surface and the hybridization
conditions used can all be varied using the initial set of cloned DNA probes.
Once the sort of array that works best is determined, a complete and fully
characterized array can be constructed by ordinary chemical synthesis.
Example 4 Preparation of Specific Probe Arrays.

With positional SBH, one potential trick to compensate for
some variations in stability among species due to GC content variation is to
provide GC rich stacking duplex adjacent to AT rich overhangs and AT rich
stacking duplex adjacent to GC rich overhangs. Moderately dense arrays can
be made using a typical x-y robot to spot the biotinylated compounds
individually onto a streptavidin-coated surface. Using such robots, it is
possible to make arrays of 2 x 10^4 samples in 100 to 400 cm² of nominal
surface. Commercially available streptavidin-coated beads can be adhered
permanently to plastics like polystyrene by exposing the plastic first to a
brief treatment with an organic solvent like triethylamine. The resulting
plastic surfaces have enormously high biotin binding capacity because of the
very high surface area that results.

In certain experiments, the need for attaching oligonucleotides
to surfaces may be circumvented altogether, and oligonucleotides attached
to streptavidin-coated magnetic microbeads used as already done in pilot
experiments. The beads can be manipulated in microtiter plates. A
magnetic separator suitable for such plates can be used, including the newly
available compressed plates. For example, the 18 by 24 well plates
(Genetix, Ltd.; USA Scientific Plastics) would allow containment of the
entire array in 3 plates. This format is well handled by existing chemical
robots. It is preferable to use the more compressed 36 by 48 well format so
the entire array would fit on a single plate. The advantages of this approach
for all the experiments are that any potential complexities from surface
effects can be avoided and already-existing liquid handling, thermal control
and imaging methods can be used for all the experiments.
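
The plate counts quoted above follow from simple arithmetic, sketched here in Python for a 1024-probe array. The function name is illustrative only.

```python
import math

def plates_needed(n_probes: int, rows: int, cols: int) -> int:
    """Number of plates required to hold one probe per well."""
    return math.ceil(n_probes / (rows * cols))

# 18 x 24 format: 432 wells per plate
print(plates_needed(1024, 18, 24))
# 36 x 48 format: 1728 wells per plate
print(plates_needed(1024, 36, 48))
```

The first call gives three plates and the second a single plate, in agreement with the text.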

Lastly, a rapid and highly efficient method to print arrays has
been developed. Master arrays are made which direct the preparation of
replicas or appropriate complementary arrays. A master array is made
manually (or by a very accurate robot) by sampling a set of custom DNA
sequences in the desired pattern and then transferring these sequences to the
replica. The master array is just a set of all 1024-4096 compounds printed
by multiple headed pipettes and compressed by offsetting. A potentially
more elegant approach is shown in Figure 8. A master array is made and
used to transfer components of the replicas in a sequence-specific way. The
sequences to be transferred are designed to contain the desired 5 or 6 base
5' variable overhang adjacent to a unique 15 base DNA sequence.

The master array consists of a set of streptavidin
bead-impregnated plastic coated metal pins. Immobilized biotinylated DNA
strands that consist of the variable 5 or 6 base segment plus the constant 15
base segment are at each tip. Any unoccupied sites on this surface are filled
with excess free biotin. To produce a replica chip, the master array is
incubated with the complement of the 15 base constant sequence, 5'-labeled
with biotin. Next, DNA polymerase is used to synthesize the complement
of the 5 or 6 base variable sequence. Then the wet pin array is touched to
the streptavidin-coated surface of the replica and held at a temperature above
the Tm of the complexes on the master array. If there is insufficient liquid
carryover from the pin array for efficient sample transfer, the replica array
could first be coated with spaced droplets of solvent, either held in concave
cavities or delivered by a multi-head pipettor. After the transfer, the replica
chip is incubated with the complement of the 15 base constant sequence to
reform the double-stranded portions of the array. The basic advantage of
this scheme is that the master array and transfer compounds are made only
once and the manufacture of replica arrays can proceed almost endlessly.
Example 5 Attachment of Nucleic Acid Probes to Solid Supports.

Nucleic acids may be attached to silicon wafers or to beads.
A silicon solid support was derivatized to provide iodoacetyl functionalities
on its surface. Derivatized solid supports were bound to disulfide containing
oligodeoxynucleotides. Alternatively, the solid support may be coated with
streptavidin or avidin and bound to biotinylated DNA.

Covalent attachment of oligonucleotide to derivatized chips:
Silicon wafers are chips with an approximate weight of 50 mg. To maintain
uniform reaction conditions, it was necessary to determine the exact weight
of each chip and select chips of similar weights for each experiment. The
reaction scheme for this procedure is shown in Figure 9.

To derivatize the chip to contain the iodoacetyl functionality,
an anhydrous solution of 25% (by volume) 3-aminopropyltriethoxysilane
in toluene was prepared under argon and aliquotted (700 µl) into tubes. A
50 mg chip requires approximately 700 µl of silane solution. Each chip was
flamed to remove any surface contaminants from its manufacture and
dropped into the silane solution. The tube containing the chip was placed
under an argon environment and shaken for approximately three hours.
After this time, the silane solution was removed and the chips were washed
three times with toluene and three times with dimethyl sulfoxide (DMSO).
A 10 mM solution of N-succinimidyl(4-iodoacetyl)aminobenzoate (SIAB)
(Pierce Chemical Co.; Rockford, IL) was prepared in anhydrous DMSO and
added to the tube containing a chip. Tubes were shaken under an argon
environment for 20 minutes. The SIAB solution was removed and after
three washes with DMSO, the chip was ready for attachment to
oligonucleotides.

Some oligonucleotides were labeled so the efficiency of
attachment could be monitored. Both 5' disulfide containing
oligodeoxynucleotides and unmodified oligodeoxynucleotides were
radiolabeled using terminal deoxynucleotidyl transferase enzyme and
standard techniques. In a typical reaction, 0.5 mM of disulfide-containing
oligodeoxynucleotide mix was added to a trace amount of the same species
that had been radiolabeled as described above. This mixture was incubated
with dithiothreitol (DTT) (6.2 µmol, 100 mM) and
ethylenediaminetetraacetic acid (EDTA) pH 8.0 (3 µmol, 50 mM). EDTA
served to chelate any cobalt that remained from the radiolabeling reaction
that would complicate the cleavage reaction. The reaction was allowed to
proceed for 5 hours at 37 °C. With the cleavage reaction essentially
complete, the free thiol-containing oligodeoxynucleotide was isolated using
a Chromaspin-10 column.

Similarly, Tris-(2-carboxyethyl)phosphine (TCEP) (Pierce
Chemical Co.; Rockford, IL) has been used to cleave the disulfide.
Conditions utilize TCEP at a concentration of approximately 100 mM in pH
4.5 buffer. It is not necessary to isolate the product following the reaction
since TCEP does not competitively react with the iodoacetyl functionality.

To each chip which had been derivatized to contain the
iodoacetyl functionality was added a 10 µM solution of the
oligodeoxynucleotide at pH 8. The reaction was allowed to proceed
overnight at room temperature. In this manner, two different
oligodeoxynucleotides have been examined for their ability to bind to the
iodoacetyl silicon wafer. The first was the free thiol containing
oligodeoxynucleotide already described. In parallel with the free thiol
containing oligodeoxynucleotide reaction, a negative control reaction has
been performed that employs a 5' unmodified oligodeoxynucleotide. This
species has similarly been 3' radiolabeled, but due to the unmodified 5'
terminus, the non-covalent, non-specific interactions may be determined.
Following the reaction, the radiolabeled oligodeoxynucleotides were
removed, the chips were washed 3 times with water, and quantitation
proceeded.

To determine the efficiency of attachment, chips of the wafer
were exposed to a phosphorimager screen (Molecular Dynamics). This
exposure usually proceeded overnight, but occasionally for longer periods
of time depending on the amount of radioactivity incorporated. For each
different oligodeoxynucleotide utilized, reference spots were made on
polystyrene in which the molar amount of oligodeoxynucleotide was known.
These reference spots were also exposed to the phosphorimager screen.
Upon scanning the screen, the quantity (in moles) of oligodeoxynucleotide
bound to each chip was determined by comparing the counts to the specific
activities of the references. Using the weight of each chip, it is possible to
calculate the area of the chip:

(g of chip) (1130 mm²/g) = x mm²

By incorporating this value, the amount of oligodeoxynucleotide bound to
each chip may be reported in fmol/mm². It is necessary to divide this value
by two since a radioactive signal of 32P is strong enough to be read through
the silicon wafer. Thus the instrument is essentially recording the
radioactivity from both sides of the chip.
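
The area and density calculation described above can be sketched as follows. This is a minimal illustration in Python; the function names and the example amount of bound oligodeoxynucleotide are hypothetical, and only the 1130 mm²/g factor and the division by two come from the text.

```python
MM2_PER_GRAM = 1130.0  # area-to-weight factor stated in the text

def chip_area_mm2(weight_g: float) -> float:
    """Chip area estimated from its weight."""
    return weight_g * MM2_PER_GRAM

def surface_density_fmol_per_mm2(bound_fmol: float, weight_g: float) -> float:
    """Bound oligodeoxynucleotide per unit area.

    The result is halved because the 32P signal is read through the wafer,
    so the screen records both faces of the chip.
    """
    return bound_fmol / chip_area_mm2(weight_g) / 2.0

# Hypothetical example: a 50 mg chip with 1130 fmol detected in total
print(chip_area_mm2(0.050))
print(surface_density_fmol_per_mm2(1130.0, 0.050))
```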

Following the initial quantitation each chip was washed in 5
x SSC buffer (75 mM sodium citrate, 750 mM sodium chloride, pH 7) with
50% formamide at 65 °C for 5 hours. Each chip was washed three times
with warm water, the 5 x SSC wash was repeated, and the chips were
requantitated. Disulfide linked oligonucleotides were removed from the
chip by incubation with 100 mM DTT at 37 °C for 5 hours.
Example 6 Attachment of Nucleic Acids to Streptavidin Coated Solid Supports.

Immobilized single-stranded DNA targets for solid-phase
DNA sequencing were prepared by PCR amplification. PCR was performed
on a Perkin Elmer Cetus DNA Thermal Cycler using Vent (exo-) DNA
polymerase (New England Biolabs; Beverly, MA), and dNTP solutions
(Promega; Madison, WI). EcoR I digested plasmid NB34 (a PCR II
plasmid with a one kb target anonymous human DNA insert) was used as
the DNA template for amplification. PCR was performed with an
18-nucleotide upstream primer and a downstream 5'-end biotinylated
18-nucleotide primer. PCR amplification was carried out in a 100 µl or
400 µl volume containing 10 mM KCl, 20 mM Tris-HCl (pH 8.8 at 25 °C),
10 mM (NH4)2SO4, 2 mM MgSO4, 0.1% Triton X-100, 250 µM dNTPs, 2.5 µM
biotinylated primer, 5 µM non-biotinylated primer, less than 100 ng of
plasmid DNA, and 6 units of Vent (exo-) DNA polymerase per 100 µl of
reaction volume. Thirty temperature cycles were performed which included
a heat denaturation step at 94 °C for 1 minute, followed by annealing of
primers to the template DNA for 1 minute at 60 °C, and DNA chain
extension with Vent (exo-) polymerase for 1 minute at 72 °C. For
amplification with the tagged primer, 45 °C was selected for primer
annealing. The PCR product was purified through an Ultrafree-MC 30,000
NMWL filter unit (Millipore; Bedford, MA) or by electrophoresis and
extraction from a low melting agarose gel. About 10 pmol of purified PCR
fragment was mixed with 1 mg of prewashed magnetic beads coated with
streptavidin (Dynabeads M280, Dynal, Norway) in 100 µl of 1 M NaCl and
TE, incubating at 37 °C or 45 °C for 30 minutes.

The magnetic beads were used directly for double stranded
sequencing. For single stranded sequencing, the immobilized biotinylated
double-stranded DNA fragment was converted to single-stranded form by
treating with freshly prepared 0.1 M NaOH at room temperature for 5
minutes. The magnetic beads with immobilized single-stranded DNA were
washed with 0.1 M NaOH and TE before use.
Example 7 Hybridization Specificity.

Hybridization was performed using probes with five and six
base pair overhangs, including a five base pair match, a five base pair
mismatch, a six base pair match, and a six base pair mismatch. These
sequences are depicted in Table 3.
Table 3
Hybridized Test Sequences

5 bp overlap, perfect match:
                               3'-TCG AGA ACC TTG GCT*-5'   (SEQ ID NO 1)
       3'-CTA CTA GGC TGC GTA GTC                           (SEQ ID NO 2)
5'-biotin-GAT GAT CCG ACG CAT CAG AGC TC-3'                 (SEQ ID NO 3)

5 bp overlap, mismatch at 3' end:
                               3'-TCG AGA ACC TTG GCT*-5'   (SEQ ID NO 1)
       3'-CTA CTA GGC TGC GTA GTC                           (SEQ ID NO 2)
5'-biotin-GAT GAT CCG ACG CAT CAG AGC TT-3'                 (SEQ ID NO 4)

6 bp overlap, perfect match:
                               3'-TCG AGA ACC TTG GCT*-5'   (SEQ ID NO 1)
       3'-CTA CTA GGC TGC GTA GTC                           (SEQ ID NO 2)
5'-biotin-GAT GAT CCG ACG CAT CAG AGC TCT-3'                (SEQ ID NO 5)

6 bp overlap, mismatch four bases from 3' end:
                               3'-TCG AGA ACC TTG GCT*-5'   (SEQ ID NO 1)
       3'-CTA CTA GGC TGC GTA GTC                           (SEQ ID NO 2)
5'-biotin-GAT GAT CCG ACG CAT CAG AGT TCT-3'                (SEQ ID NO 6)

The biotinylated double-stranded probe was prepared in TE
buffer by annealing the complementary single strands together at 68 °C for
five minutes followed by slow cooling to room temperature. A five-fold
excess of monodisperse, polystyrene-coated magnetic beads (Dynal) coated
with streptavidin was added to the double-stranded probe, which was then
incubated with agitation at room temperature for 30 minutes. After ligation,
the samples were subjected to two cold (4 °C) washes followed by one hot
(90 °C) wash in TE buffer (Figure 10). The ratio of 32P in the hot
supernatant to the total amount of 32P was determined (Figure 11). At high
NaCl concentrations, mismatched target sequences were either not annealed
or were removed in the cold washes. Under the same conditions, the
matched target sequences were annealed and ligated to the probe. The final
hot wash removed the non-biotinylated probe oligonucleotide. This
oligonucleotide contained the labeled target if the target had been ligated to
the probe.
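
The match and mismatch cases of Table 3 can be checked with a short base-pairing routine. The sketch below, in Python, assumes the probe overhang is written 5'->3' and the target 3'->5', as in the table, so that the strands align position by position. The function and variable names are illustrative and not part of the described method.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def overhang_matches(probe_overhang_5to3: str, target_3to5: str) -> bool:
    """True if every overhang base pairs with the aligned base at the target's 3' end."""
    n = len(probe_overhang_5to3)
    segment = target_3to5[:n]
    if len(segment) < n:
        return False
    return all(PAIR[p] == t for p, t in zip(probe_overhang_5to3, segment))

TARGET = "TCGAGAACCTTGGCT"  # SEQ ID NO 1, written 3'->5' as in Table 3

for overhang in ("AGCTC", "AGCTT", "AGCTCT", "AGTTCT"):
    print(overhang, overhang_matches(overhang, TARGET))
```

The two perfect-match overhangs (AGCTC and AGCTCT) pair fully with the target, while the two mismatch probes fail at the 3' terminal base and at the fourth base from the 3' end respectively.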

Example 8 Compensating for Variations in Base Composition. 

The dependence of Tm on base composition and on base
sequence may be overcome with the use of salts like tetramethyl ammonium
halides or betaines. Alternatively, base analogs like 2,6-diamino purine and
5-bromo U can be used instead of A and T, respectively, to increase the
stability of A-T base pairs, and derivatives like 7-deazaG can be used to
decrease the stability of G-C base pairs. The initial experiments shown in
Table 2 indicate that the use of enzymes will eliminate many of the
complications due to base sequences. This gives the approach a very
significant advantage over non-enzymatic methods which require different
conditions for each nucleic acid and are highly matched to GC content.

Another approach to compensate for differences in stability is
to vary the base next to the stacking site. Experiments were performed to
test the relative effects of all four bases in this position on overall
hybridization discrimination and also on relative ligation discrimination.
Other base analogs such as dU (deoxyuridine) and 7-deazaG may also be
useful to suppress effects of secondary structure.

Example 9 DNA Ligation to Oligonucleotide Arrays.

E. coli and T4 DNA ligases can be used to covalently attach
hybridized target nucleic acid to the correct immobilized oligonucleotide
probe. This is a highly accurate and efficient process. Because ligase
absolutely requires a correctly base paired 3' terminus, ligase will read only
the 3'-terminal sequence of the target nucleic acid. After ligation, the
resulting duplex will be 23 base pairs long and it will be possible to remove
unhybridized, unligated target nucleic acid using fairly stringent washing
conditions. Appropriately chosen positive and negative controls
demonstrate the specificity of this method, such as arrays which are lacking
a 5'-terminal phosphate adjacent to the 3' overhang since these probes will
not ligate to the target nucleic acid.

There are a number of advantages to a ligation step. Physical
specificity is supplanted by enzymatic specificity. Focusing on the 3' end
of the target nucleic acid also minimizes problems arising from stable
secondary structures in the target DNA. DNA ligases are also used to
covalently attach hybridized target DNA to the correct immobilized
oligonucleotide probe. Several tests of the feasibility of the ligation method
are shown in Figure 12. Biotinylated probes were attached at 5' ends
(Figure 12A) or 3' ends (Figure 12B) to streptavidin-coated magnetic
microbeads, and annealed with a shorter, complementary, constant sequence
to produce duplexes with 5 or 6 base single-stranded overhangs. 32P-end
labeled targets were allowed to hybridize to the probes. Free targets were
removed by capturing the beads with a magnetic separator. DNA ligase was
added and ligation was allowed to proceed at various salt concentrations.
The samples were washed at room temperature, again manipulating the
immobilized compounds with a magnetic separator to remove non-ligated
material. Finally, samples were incubated at a temperature above the Tm of
the duplexes, and eluted single strand was retained after the remainder of
the samples were removed by magnetic separation. The eluate at this point
consisted of the ligated material. The fraction of ligation was estimated as
the amount of 32P recovered in the high temperature wash versus the amount
recovered in both the high and low temperature washes. Results indicated
that salt conditions can be found where the ligation proceeds efficiently
with perfectly matched 5 or 6 base overhangs, but not with G-T mismatches.
The results of a more extensive set of similar experiments are shown in
Tables 4-6.
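
The estimate of the fraction of ligation described above amounts to a simple ratio of recovered label. A minimal Python sketch follows; the function name and the example counts are hypothetical and are shown only to make the calculation concrete.

```python
def ligation_fraction(hot_wash_counts: float, cold_wash_counts: float) -> float:
    """Fraction of label recovered in the high temperature wash.

    hot_wash_counts: 32P eluted above the duplex Tm (ligated material)
    cold_wash_counts: 32P removed in the low temperature washes (non-ligated)
    """
    total = hot_wash_counts + cold_wash_counts
    if total <= 0:
        raise ValueError("no label recovered")
    return hot_wash_counts / total

# Hypothetical counts chosen only for illustration
print(round(ligation_fraction(170.0, 830.0), 3))
```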

Table 4 looks at the effect of the position of the mismatch and
Table 5 examines the effect of base composition on the relative
discrimination of perfect matches versus weakly destabilizing mismatches.
These data demonstrate that effective discrimination between perfect
matches and single mismatches occurs with all five base overhangs tested
and that there is little if any effect of base composition on the amount of
ligation seen or the effectiveness of match/mismatch discrimination. Thus,
the serious problems of dealing with base composition effects on stability
seen in ordinary SBH do not appear to be a problem for positional SBH.
Furthermore, as the worst mismatch position was the one distal from the
phosphodiester bond formed in the ligation reaction, any mismatches that
survived in this position would be eliminated by a polymerase extension
reaction. A polymerase such as Sequenase version 2, that has no
3'-endonuclease activity or terminal transferase activity, would be useful in
this regard. Gel electrophoresis analysis confirmed that the putative ligation
products seen in these tests were indeed the actual products synthesized.

Table 4
Ligation Efficiency of Matched and Mismatched Duplexes
in 0.2 M NaCl at 37 °C

(SEQ ID NO 1)                3'-TCG AGA ACC TTG GCT-5'
     3'-CTA CTA GGC TGC GTA GTC-5'                  (SEQ ID NO 2)

Probe                                    Ligation Efficiency
5'-B-GAT GAT CCG ACG CAT CAG AGC TC      0.170      (SEQ ID NO 3)
5'-B-GAT GAT CCG ACG CAT CAG AGC TT      0.006      (SEQ ID NO 4)
5'-B-GAT GAT CCG ACG CAT CAG AGC TA      0.006      (SEQ ID NO 7)
5'-B-GAT GAT CCG ACG CAT CAG AGC CC      0.002      (SEQ ID NO 8)
5'-B-GAT GAT CCG ACG CAT CAG AGT TC      0.004      (SEQ ID NO 9)
5'-B-GAT GAT CCG ACG CAT CAG AAC TC      0.001      (SEQ ID NO 10)



Table 5
Ligation Efficiency of Matched and Mismatched Duplexes in
0.2 M NaCl at 37 °C and its Dependence on AT Content of the
Overhang

            Overhang Sequences   AT Content   Ligation Efficiency
Match       GGCCC                0/5          0.30
Mismatch    GGCCT                             0.03
Match       AGCCC                1/5          0.36
Mismatch    AGCTC                             0.02
Match       AGCTC                2/5          0.17
Mismatch    AGCTT                             0.01
Match       AGATC                3/5          0.24
Mismatch    AGATT                             0.01
Match       ATATC                4/5          0.17
Mismatch    ATATT                             0.01
Match       ATATT                5/5          0.31
Mismatch    ATATC                             0.02
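
To make the comparison in Table 5 explicit, the following Python sketch recomputes the AT content of each matched overhang and the ratio of matched to mismatched ligation efficiency. The ratio is used here as an illustrative discrimination factor; the helper names are hypothetical.

```python
# (matched overhang, efficiency, mismatched overhang, efficiency), from Table 5
TABLE_5 = [
    ("GGCCC", 0.30, "GGCCT", 0.03),
    ("AGCCC", 0.36, "AGCTC", 0.02),
    ("AGCTC", 0.17, "AGCTT", 0.01),
    ("AGATC", 0.24, "AGATT", 0.01),
    ("ATATC", 0.17, "ATATT", 0.01),
    ("ATATT", 0.31, "ATATC", 0.02),
]

def at_content(overhang: str) -> int:
    """Number of A or T bases in the overhang."""
    return sum(base in "AT" for base in overhang)

def discrimination(match_eff: float, mismatch_eff: float) -> float:
    """Ratio of matched to mismatched ligation efficiency."""
    return match_eff / mismatch_eff

ratios = [discrimination(m, mm) for _, m, _, mm in TABLE_5]
for (match, m_eff, _, mm_eff), ratio in zip(TABLE_5, ratios):
    print(f"{match}  AT {at_content(match)}/5  discrimination x{ratio:.1f}")
```

On these data every matched overhang is ligated at least about ten times more efficiently than its mismatched partner, with no obvious trend across AT content, which is the point made in the text.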
Table 6
Increasing Discrimination by Sequencing Extension at 37 °C

                                         Ligation      Ligation Extension
                                         Efficiency      (+)        (-)
                                         (percent)

(SEQ ID NO 1)                3'-TCG AGA ACC TTG GCT*-5'
     3'-CTA CTA GGC TGC GTA GTC-5'                  (SEQ ID NO 2)
5'-B-GAT GAT CCG ACG CAT CAG AGA TC      0.24      4,934     29,500   (SEQ ID NO 11)
5'-B-GAT GAT CCG ACG CAT CAG AGC TT      0.01        116        250   (SEQ ID NO 4)
Discrimination =                         x24        x42       x118

(SEQ ID NO 1)                3'-TCG AGA ACC TTG GCT*-5'
     3'-CTA CTA GGC TGC GTA GTC-5'                  (SEQ ID NO 2)
5'-B-GAT GAT CCG ACG CAT CAG ATA TC      0.17   12,[?]50     25,200   (SEQ ID NO 12)
5'-B-GAT GAT CCG ACG CAT CAG ATA TT      0.01        240        390   (SEQ ID NO 13)
Discrimination =                         x17        x51        x65

* radioactive label


The discrimination for the correct sequence is not as great with
an external mismatch (which would be the most difficult case to
discriminate) as with an internal mismatch (Table 6). A mismatch right at
the ligation point would presumably offer the highest possible
discrimination. In any event, the results shown are very promising. Already
there is a level of discrimination with only 5 or 6 bases of overlap that is
better than the discrimination seen in conventional SBH with 8 base
overlaps.

Example 10 Capture and Sequencing of a Target Nucleic Acid.

A mixture of target DNA was prepared by mixing equal molar 
ratio of eight different oligos. For each sequencing reaction, one specific 
partially duplex probe and eight different targets were used. The sequence 
of the probe and the targets are shown in Tables 7 and 8. 



Table 7
Duplex Probes Used

(DF25)   5'-F-GATGATCCGACGCATCAG [overhang illegible]   (SEQ ID NO 14)
         3'-CTACTAGGCTGCGTAGTC                           (SEQ ID NO 2)

(DF37)   5'-F-GATGATCCGACGCATCAC [overhang illegible]   (SEQ ID NO 15)
         3'-CTACTAGGCTGCGTAGTG                           (SEQ ID NO 2)

(DF22)   5'-F-GATGATCCGACGCATCAG AATGT                  (SEQ ID NO 16)
         3'-CTACTAGGCTGCGTAGTC                           (SEQ ID NO 2)

(DF28)   5'-F-GATGATCCGACGCATCAG CCTAG                  (SEQ ID NO 17)
         3'-CTACTAGGCTGCGTAGTC                           (SEQ ID NO 2)

(DF36)   5'-F-GATGATCCGACGCATCAG TCGAC                  (SEQ ID NO 18)
         3'-CTACTAGGCTGCGTAGTC                           (SEQ ID NO 2)

(DF11a)  5'-F-GATGATCCGACGCATCA CAGCTC                  (SEQ ID NO 19)
         3'-CTACTAGGCTGCGTAGTG                           (SEQ ID NO 2)

(DF8a)   5'-F-GATGATCCGACGCATCAA GGC[remainder illegible]   (SEQ ID NO 20)
         3'-CTACTAGGCTGCGTAGTT                           (SEQ ID NO 2)



Table 8
Mixture of Targets

Match
([name illegible])  3'-[illegible]ACCGGATCGAGCCGGGTCGATCTAG   (DF22)    (SEQ ID NO 21)
(NB4.5)             3'-[illegible]CGACCGGGTCGATCTAG           (DF28)    (SEQ ID NO 22)
(DF5)               3'-[illegible]CCGGATCGAGCCGGGTCGATCTAG    (DF36)    (SEQ ID NO 23)
(TS10)              3'-TCGAGAACCTTGGCT                        (DF11a)   (SEQ ID NO 24)
(NB3.10)            3'-[illegible]TCGATCTAG                   (DF8a)    (SEQ ID NO 25)

Mismatch
([name illegible])  3'-[illegible]CGATCTAG                    (DF8a)    (SEQ ID NO 26)
([name illegible])  [sequence and probe illegible]                      (SEQ ID NO 27)
(NB3.9)             3'-[illegible]GGTCGATCTAG                 (DF36)    (SEQ ID NO 28)



Two pmol of each of the two duplex-probe-forming
oligonucleotides and 1.5 pmol of each of the eight different targets were
mixed in a 10 µl volume containing 2 µl of Sequenase buffer stock (200 mM
Tris-HCl, pH 7.5, 100 mM MgCl2, and 250 mM NaCl) from the Sequenase
kit. The annealing mixture was heated to 65 °C and allowed to cool slowly
to room temperature. While the reaction mixture was kept on ice, 1 µl 0.1
M dithiothreitol solution, 1 µl Mn buffer (0.15 M sodium isocitrate and 0.1
M MnCl2), and 2 µl of diluted Sequenase (1.5 units) were mixed, and 2
µl of the reaction mixture was added to each of the four termination mixes
at room temperature (each consisting of 3 µl of the appropriate termination
mix: 16 µM dATP, 16 µM dCTP, 16 µM dGTP, 16 µM dTTP and 3.2 µM
of one of the four ddNTPs, in 50 mM NaCl). The reaction mixtures were
further incubated at room temperature for 5 minutes, and terminated with the
addition of 4 µl of Pharmacia stop mix (deionized formamide containing
dextran blue 6 mg/ml). Samples were denatured at 90-95 °C for 3 minutes
and stored on ice prior to loading. Sequencing samples were analyzed on
an ALF DNA sequencer (Pharmacia Biotech; Piscataway, NJ) using a 10%
polyacrylamide gel containing 7 M urea and 0.6 x TBE. Sequencing results
from the gel reader are shown in Figure 13 and summarized in Table 9.
Matched targets hybridized correctly and are sequenced, whereas
mismatched targets do not hybridize and are not sequenced.
Table 9
Summary of Hybridization Data

Reaction                                 Hybridization   Comment
1    Probe: DF25    Target: mixture      No              mismatch
2    Probe: DF37    Target: mixture      No              mismatch
3    Probe: DF22    Target: mixture      Yes             match
4    Probe: DF28    Target: mixture      Yes             match
5    Probe: DF36    Target: mixture      Yes             match
6    Probe: DF11a   Target: mixture      Yes             match
7    Probe: DF8a    Target: mixture      Yes             match
8    Probe: DF8a    Target: NB3.4        No              mismatch
9    Probe: DF8a    Target: TS12         No              mismatch
10   Probe: DF37    Target: DF5          No              mismatch


Example 11 Elongation of Nucleic Acids Bound to Solid Supports.

Elongation was carried out either by using Sequenase version
2.0 kit or an AutoRead sequencing kit (Pharmacia Biotech; Piscataway, NJ)
employing T7 DNA polymerase. Elongation of the immobilized
single-stranded DNA target was performed with reagents from the
sequencing kits for Sequenase Version 2.0 or T7 DNA polymerase. A duplex
DNA probe containing a 5-base 3' overhang was used as a primer. The duplex
has a 5'-fluorescein labeled 23-mer, containing an 18-base 5' constant
region and a 5-base 3' variable region (which has the same sequence as the
5'-end of the corresponding nonbiotinylated primer for PCR amplification of
target DNA), and an 18-mer complementary to the constant region of the
23-mer. The duplex was formed by annealing 20 pmol of each of the two
oligonucleotides in a 10 µl volume containing 2 µl of Sequenase buffer
stock (200 mM Tris-HCl, pH 7.5, 100 mM MgCl2, and 250 mM NaCl) from
the Sequenase kit or in a 13 µl volume containing 2 µl of the annealing
buffer (1 M Tris-HCl, pH 7.6, 100 mM MgCl2) from the AutoRead
sequencing kit. The annealing mixture was heated to 65 °C and allowed to
cool slowly to 37 °C over a 20-30 minute time period. The duplex primer
was annealed with the immobilized single-stranded DNA target by adding
the annealing mixture to the DNA-containing magnetic beads, and the
resulting mixture was further incubated at 37 °C for 5 minutes, room
temperature for 10 minutes, and finally 0 °C for at least 5 minutes. For
Sequenase reactions, 1 µl 0.1 M dithiothreitol solution, 1 µl Mn buffer (0.15
M sodium isocitrate and 0.1 M MnCl2) for the relatively short target, and
2 µl of diluted Sequenase (1.5 units) were added, and the reaction mixture
was divided into four ice cold termination mixes (each consisting of 3 µl of
the appropriate termination mix: 80 µM dATP, 80 µM dCTP, 80 µM dGTP,
80 µM dTTP and 8 µM of one of the four ddNTPs, in 50 mM NaCl). For T7
DNA polymerase reactions, 1 µl of extension buffer (40 mM MnCl2, pH 7.5,
304 mM citric acid and 324 mM DTT) and 1 µl of T7 DNA polymerase (8
units) were mixed, and the reaction volume was split into four ice cold
termination mixes (each consisting of 1 µl DMSO and 3 µl of the
appropriate termination mix: 1 mM dATP, 1 mM dCTP, 1 mM dGTP, 1
mM dTTP and 5 µM of one of the four ddNTPs, in 50 mM NaCl and 40 mM
Tris-HCl, pH 7.4). The reaction mixtures for both enzymes were further
incubated at 0 °C for 5 minutes, room temperature for 5 minutes and 37 °C
for 5 minutes. After the completion of extension, the supernatant was
removed, and the magnetic beads were re-suspended in 10 µl of Pharmacia
stop mix. Samples were denatured at 90-95 °C for 5 minutes (under this
harsh condition, both DNA template and the dideoxy fragments are released
from the beads) and stored on ice prior to loading. A control experiment
was performed in parallel using an 18-mer complementary to the 3' end of
target DNA as the sequencing primer instead of the duplex probe, and the
annealing of the 18-mer to its target was carried out in a similar way as the
annealing of the duplex probe.

Example 12 Chain Elongation of Target Sequences.

Sequencing of immobilized target DNA can be performed with
Sequenase Version 2.0. A total of 5 elongation reactions, one with
each of 4 dideoxy nucleotides and one with all four simultaneously, are
performed. A sequencing solution containing 40 mM Tris-HCl, pH 7.5,
20 mM MgCl2, 50 mM NaCl, 10 mM dithiothreitol, 15 mM sodium isocitrate,
10 mM MnCl2 and 100 u/ml of Sequenase (1.5 units) is added to the
hybridized target DNA. dATP, dCTP, dGTP and dTTP are added to 20 µM to
initiate the elongation reaction. In the separate reactions, one of the
four ddNTPs is added to reach a concentration of 8 µM. In the combined
reaction all four ddNTPs are added to the reaction to 8 µM each. The
reaction mixtures were incubated at 0°C for 5 minutes, room temperature
for 5 minutes and 37°C for 5 minutes. After the completion of
extension, the supernatant was removed and the elongated DNA washed
with 2 mM EDTA to terminate elongation reactions. Reaction products are
analyzed by mass spectrometry.
Example 13 Capillary Electrophoretic Analysis of Target Nucleic Acid.

Molecular weights of target sequences may also be determined by
capillary electrophoresis. A single laser capillary electrophoresis
instrument can be used to monitor the performance of sample preparations
in high performance capillary electrophoresis sequencing. This
instrument is designed so that it is easily converted to multiple
channel (wavelength) detection.

An individual element of the sample array may be engineered directly
to serve as the sample input to a capillary. Typical capillaries are
250 microns o.d. and 75 microns i.d. The sample is heated or denatured
to release the DNA ladder into a liquid droplet; the silicon array
surface is ideal for this purpose. The capillary can be brought into
contact with the droplet to load the sample.

To facilitate loading of large numbers of samples simultaneously or
sequentially, there are two basic methods. With 250 micron o.d.
capillaries it is feasible to match the dimensions of the target array
and the capillary array. Then the two could be brought into contact
manually or even by a robot arm using a jig to assure accurate
alignment. An electrode may be engineered directly into each sector of
the silicon surface so that sample loading would only require contact
between the surface and the capillary array.

The second method is based on an inexpensive collection system to
capture fractions eluted from high performance capillary
electrophoresis. Dilution is avoided by using designs which allow sample
collection without a perpendicular sheath flow. The same apparatus
designed as a sample collector can also serve inversely as a sample
loader. In this case, each row of the sample array, equipped with
electrodes, is used directly to load samples automatically on a row of
capillaries. Using either method, sequence information is determined and
the target sequence constructed.

Example 14 Mass Spectrometry of Nucleic Acids.

Nucleic acids to be analyzed by mass spectrometry were redissolved
in ultrapure water (MilliQ, Millipore) using amounts to obtain a
concentration of 10 pmoles/µl as stock solution. An aliquot (1 µl) of
this concentration or a dilution in ultrapure water was mixed with 1 µl
of the matrix solution on a flat metal surface serving as the probe tip
and dried with a fan using cold air. In some experiments, cation-ion
exchange beads in the acid form were added to the mixture of matrix and
sample solution to stabilize ions formed during analysis.

MALDI-TOF spectra were obtained on different commercial instruments
such as Vision 2000 (Finnigan-MAT), VG TofSpec (Fisons Instruments) and
LaserTec Research (Vestec). The conditions were linear negative ion
mode with an acceleration voltage of 25 kV. Mass calibration was done
externally and generally achieved by using defined peptides of
appropriate mass range such as insulin, gramicidin S, trypsinogen,
bovine serum albumin and cytochrome C. All spectra were generated by
employing a nitrogen laser with 5 nanosecond pulses at a wavelength of
337 nm. Laser energy varied between 10^6 and 10^7 W/cm^2. To improve
signal-to-noise ratio generally, the intensities of 10 to 30 laser
shots were accumulated. The output of a typical mass spectrometry
showing discrimination between nucleic acids which differ by one base
is shown in Figure 14.
Example 15

Elongation of a target nucleic acid, in the presence of dideoxy
chain terminating nucleotides, generated four families of
chain-terminated fragments. The mass difference per nucleotide addition
is 289.19 for dpC, 313.21 for dpA, 329.21 for dpG and 304.20 for dpT,
respectively. By comparing the mass differences measured between
fragments with the known masses of each nucleotide, the nucleic acid
sequence can be determined. Nucleic acid may also be sequenced by
performing polymerase chain elongation in four separate reactions each
with one dideoxy chain terminating nucleotide. To examine mass
differences, 13 oligonucleotides [...] were analyzed by MALDI mass
spectrometry. The correlation of calculated molecular weights of the
ddT fragments of a Sanger sequencing reaction and their experimentally
verified weights is shown in Table 10. When the mass spectrometry data
from all four chain termination reactions are combined, the molecular
weight difference between two adjacent peaks can be used to determine
the sequence.
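
The read-out described above can be illustrated with a minimal sketch.
It assumes an idealized, noise-free peak list pooled from the four
termination reactions; the function names, the 2.0 dalton matching
tolerance and the arbitrary starting mass are assumptions made for
illustration and are not part of the described method.

```python
# Mass added per nucleotide (daltons), as quoted in Example 15.
RESIDUE_MASS = {"C": 289.19, "A": 313.21, "G": 329.21, "T": 304.20}


def call_base(delta, tolerance=2.0):
    """Return the base whose mass is closest to delta, or 'N' if no base
    lies within the tolerance (the tolerance value is an assumption)."""
    base, mass = min(RESIDUE_MASS.items(), key=lambda item: abs(item[1] - delta))
    return base if abs(mass - delta) <= tolerance else "N"


def read_ladder(peak_masses, tolerance=2.0):
    """Read a sequence from the pooled peak masses of all four reactions
    by assigning a base to each difference between adjacent peaks."""
    ordered = sorted(peak_masses)
    return "".join(call_base(b - a, tolerance) for a, b in zip(ordered, ordered[1:]))


def simulate_ladder(start_mass, sequence):
    """Build idealized peak masses for a known sequence (illustration only)."""
    peaks = [start_mass]
    for base in sequence:
        peaks.append(peaks[-1] + RESIDUE_MASS[base])
    return peaks


# 5000.0 is an arbitrary starting mass chosen for the illustration.
peaks = simulate_ladder(5000.0, "GATTC")
print(read_ladder(peaks))
```

The smallest gap between any two of the four mass increments is about
9 daltons (dpA versus dpT), so the mass accuracy of the measurement
must be comfortably better than that for unambiguous base calls.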
Table 10

Summary of Molecular Weights Expected v. Measured

Fragment (n-mer)   Calculated Mass   Experimental Mass   Difference

7-mer                   2104.45            2119.9           +15.4
10-mer                  3011.04            3026.1           +15.1
11-mer                  3315.24            3330.1           +14.9
19-mer                  5771.82            5788.0           +16.2
20-mer                  6076.02            6093.8           +17.8
24-mer                  7311.82            7374.9           +63.1
26-mer                  7945.22            7960.9           +15.7
[...]                  10112.63           10125.3           +12.7
[...]                  11348.43           11361.4           +13.0
38-mer                 11652.62           11670.2           +17.6
42-mer                 12872.42           12888.3           +15.9
[...]                  14108.22           14125.0           +16.8
[...]                  15344.02           15362.6           +18.6
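
As a consistency check on Table 10, the following sketch recomputes the
difference column from the calculated and experimental masses and flags
rows that deviate far more than the rest. The outlier rule (more than
twice the median deviation) is an assumption chosen for illustration and
is not taken from the specification.

```python
import statistics

# (calculated mass, experimental mass, reported difference) in daltons,
# one tuple per row of Table 10, in table order.
TABLE_10 = [
    (2104.45, 2119.9, 15.4),
    (3011.04, 3026.1, 15.1),
    (3315.24, 3330.1, 14.9),
    (5771.82, 5788.0, 16.2),
    (6076.02, 6093.8, 17.8),
    (7311.82, 7374.9, 63.1),
    (7945.22, 7960.9, 15.7),
    (10112.63, 10125.3, 12.7),
    (11348.43, 11361.4, 13.0),
    (11652.62, 11670.2, 17.6),
    (12872.42, 12888.3, 15.9),
    (14108.22, 14125.0, 16.8),
    (15344.02, 15362.6, 18.6),
]


def deviations(rows):
    """Experimental minus calculated mass for each row, to 0.01 dalton."""
    return [round(measured - calculated, 2) for calculated, measured, _ in rows]


def outliers(rows, factor=2.0):
    """Rows whose deviation exceeds factor times the median deviation
    (an assumed, illustrative rule)."""
    devs = deviations(rows)
    median = statistics.median(devs)
    return [row for row, dev in zip(rows, devs) if dev > factor * median]


for row, dev in zip(TABLE_10, deviations(TABLE_10)):
    print(f"{row[0]:>9.2f}  {row[1]:>8.1f}  {dev:+6.2f}")
print("flagged:", outliers(TABLE_10))
```

Under this rule only the row with a calculated mass of 7311.82 is
flagged; the remaining rows deviate by roughly 13 to 19 daltons.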



Example 16 Reduced Pass Sequencing.

To maximize the use of PSBH arrays to produce Sanger ladders, the
sequence of a target should be covered as completely as possible with
the lowest amount of initial sequencing redundancy. This will maximize
the performance of individual elements of the arrays and maximize the
amount of useful sequence data obtained each time an array is used. With
an unknown DNA, a full array of 1024 elements (Mwo I or BsiVI cleavage)
or 256 elements (TspR I cleavage) is used. A 50 kb target DNA is cut
into about 64 fragments by Mwo I or BsiVI or 30 fragments by TspR I,
respectively. Each fragment has two ends, both of which can be captured
independently. The coverage of each array after capture and ignoring
degeneracies is 128/1024 sites in the first case and 60/256 sites in
the second case. Direct use of such an array to blindly deliver samples
element by element for mass spectrometry sequencing would be inefficient
since most array elements will have no samples.
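
The coverage figures quoted above follow from simple arithmetic, shown
here as a short sketch. The function name is hypothetical; the fragment
and element counts are those given in the text, and degeneracies are
ignored as stated.

```python
def array_coverage(fragments, elements):
    """Return (captured ends, fraction of array elements occupied),
    ignoring degeneracies as the text does."""
    ends = 2 * fragments  # each fragment has two independently captured ends
    return ends, ends / elements


for enzyme, fragments, elements in [
    ("Mwo I or BsiVI", 64, 1024),
    ("TspR I", 30, 256),
]:
    ends, fraction = array_coverage(fragments, elements)
    print(f"{enzyme}: {ends}/{elements} sites ({fraction:.1%} of the array)")
```

At most about one eighth and one quarter of the elements, respectively,
carry a sample after a single capture step, which is why blind
element-by-element delivery is described as inefficient.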

In one method, phosphatased double-stranded targets are used at
high concentrations to saturate each array element that detects a
sample. The target is ligated to make the capture irreversible. Next a
different sample mixture is exposed to the array and subsequently
ligated in place. This process is repeated four or five times until most
of the elements of the array contain a unique sample. Any tandem
target-target complexes will be removed by a subsequent ligating step
because all of the targets are phosphatased.

Alternatively, the array may be monitored by confocal microscopy
after the elongation reactions. This reveals which elements contain
elongated nucleic acids, and this information is communicated to an
automated robotic system that is ultimately used to load the samples
onto a mass spectrometry analyzer.

Example 17 Synthesis of Mass Modified Nucleic Acid Primers.

Mass modification at the 5' sugar: Oligonucleotides were synthesized
by standard automated DNA synthesis using β-cyanoethylphosphoamidites
and a 5'-amino group introduced at the end of solid phase DNA synthesis.
The total amount of an oligonucleotide synthesis, starting with 0.25
micromoles CPG-bound nucleoside, is deprotected with concentrated
aqueous ammonia, purified via OligoPAK(TM) Cartridges (Millipore;
Bedford, MA) and lyophilized. This material with a 5'-terminal amino
group is dissolved in 100 µl absolute N,N-dimethylformamide (DMF) and
condensed with 10 µmole N-Fmoc-glycine pentafluorophenyl ester for 60
minutes at 25°C. After ethanol precipitation and centrifugation, the
Fmoc group is cleaved off by a 10 minute treatment with 100 µl of a
solution of 20% piperidine in N,N-dimethylformamide. Excess piperidine,
DMF and the cleavage product from the Fmoc group are removed by ethanol
precipitation and the precipitate lyophilized from 10 mM TEAA buffer
pH 7.2. This material is now either used as primer for the Sanger DNA
sequencing reactions or one or more glycine residues (or other suitable
protected amino acid active esters) are added to create a series of
mass-modified primer oligonucleotides suitable for Sanger DNA or RNA
sequencing.

Mass modification at the heterocyclic base with glycine: Starting
material was 5-(3-aminopropynyl-1)-3',5'-di-p-tolyldeoxyuridine prepared
and 3',5'-de-O-acylated (Haralambidis et al., Nuc. Acids Res.
15:4857-76, 1987). 0.281 g (1.0 mmol) 5-(3-aminopropynyl-1)-2'-
deoxyuridine was reacted with 0.927 g (2.0 mmol) N-Fmoc-glycine
pentafluorophenylester in 5 ml absolute N,N-dimethylformamide in the
presence of 0.129 g (1 mmol; 174 µl) N,N-diisopropylethylamine for 60
minutes at room temperature. Solvents were removed by rotary evaporation
and the product was purified by silica gel chromatography (Kieselgel 60,
Merck; column: 2.5 x 50 cm, elution with chloroform/methanol mixtures).
Yield was 0.44 g (0.78 mmol; 78%). To add another glycine residue, the
Fmoc group is removed with a 20 minute treatment with a 20% solution of
piperidine in DMF, evaporated in vacuo and the remaining solid material
extracted three times with 20 ml ethylacetate. After having removed the
remaining ethylacetate, N-Fmoc-glycine pentafluorophenylester is coupled
as described above. 5-(3-(N-Fmoc-glycyl)-amidopropynyl-1)-2'-
deoxyuridine is transformed into the 5'-O-dimethoxytritylated
nucleoside-3'-O-β-cyanoethyl-N,N-diisopropylphosphoamidite and
incorporated into automated oligonucleotide synthesis. This glycine
modified thymidine analogue building block for chemical DNA synthesis
can be used to substitute one or more of the thymidine/uridine
nucleotides in the nucleic acid primer sequence. The Fmoc group is
removed at the end of the solid phase synthesis with a 20 minute
treatment with a 20% solution of piperidine in DMF at room temperature.
DMF is removed by a washing step with acetonitrile and the
oligonucleotide deprotected and purified.

Mass modification at the heterocyclic base with β-alanine: 0.281 g
(1.0 mmol) 5-(3-aminopropynyl-1)-2'-deoxyuridine was reacted with
N-Fmoc-β-alanine pentafluorophenylester (0.955 g; 2.0 mmol) in 5 ml
N,N-dimethylformamide (DMF) in the presence of 0.129 g (174 µl; 1.0
mmol) N,N-diisopropylethylamine for 60 minutes at room temperature.
Solvents were removed and the product purified by silica gel
chromatography. Yield was 0.425 g (0.74 mmol; 74%). Another β-alanine
residue can be added in the same way after removal of the Fmoc group.
The preparation of the 5'-O-dimethoxytritylated nucleoside-3'-O-β-
cyanoethyl-N,N-diisopropylphosphoamidite from 5-(3-(N-Fmoc-β-alanyl)-
amidopropynyl-1)-2'-deoxyuridine and incorporation into automated
oligonucleotide synthesis is performed under standard conditions. This
building block can substitute for any of the thymidine/uridine residues
in the nucleic acid primer sequence.

Mass modification at the heterocyclic base with ethylene glycol
monomethyl ether: 5-(3-aminopropynyl-1)-2'-deoxyuridine was used as a
nucleosidic component in this example. 7.61 g (100.0 mmol) freshly
distilled ethylene glycol monomethyl ether dissolved in 50 ml absolute
pyridine was reacted with 10.01 g (100.0 mmol) recrystallized succinic
anhydride in the presence of 1.22 g (10.0 mmol) 4-N,N-
dimethylaminopyridine overnight at room temperature. The reaction was
terminated by the addition of water (5.0 ml), the reaction mixture
evaporated in vacuo, co-evaporated twice with dry toluene (20 ml each)
and the residue redissolved in 100 ml dichloromethane. The solution was
extracted twice with 10% aqueous citric acid (2 x 20 ml) and once with
water (20 ml) and the organic phase dried over anhydrous sodium sulfate.
The organic phase was evaporated in vacuo. The residue was redissolved
in 50 ml dichloromethane and precipitated into 500 ml pentane and the
precipitate dried in vacuo. Yield was 13.12 g (74.0 mmol; 74%). 8.86 g
(50.0 mmol) of succinylated ethylene glycol monomethyl ether was
dissolved in 100 ml dioxane containing 5% dry pyridine (5 ml), and
6.96 g (50.0 mmol) 4-nitrophenol and 10.32 g (50.0 mmol)
dicyclohexylcarbodiimide were added and the reaction run at room
temperature for 4 hours. Dicyclohexylurea was removed by filtration, the
filtrate evaporated in vacuo and the residue redissolved in 50 ml
anhydrous DMF. 12.5 ml (about 12.5 mmol 4-nitrophenylester) of this
solution was used to dissolve 2.81 g (10.0 mmol) 5-(3-aminopropynyl-1)-
2'-deoxyuridine. The reaction was performed in the presence of 1.01 g
(10.0 mmol; 1.4 ml) triethylamine overnight at room temperature. The
reaction mixture was evaporated in vacuo, co-evaporated with toluene,
redissolved in dichloromethane and chromatographed on silica gel (Si60,
Merck; column 4 x 50 cm) with dichloromethane/methanol mixtures.
Fractions containing the desired compound were collected, evaporated,
redissolved in 25 ml dichloromethane and precipitated into 250 ml
pentane. The dried precipitate of 5-(3-N-(O-succinyl ethylene glycol
monomethyl ether)-amidopropynyl-1)-2'-deoxyuridine (yield 65%) is 5'-O-
dimethoxytritylated and transformed into the nucleoside-3'-O-β-
cyanoethyl-N,N-diisopropylphosphoamidite and incorporated as a building
block in the automated oligonucleotide synthesis according to standard
procedures. The mass-modified nucleotide can substitute for one or more
of the thymidine/uridine residues in the nucleic acid primer sequence.
Deprotection and purification of the primer oligonucleotide also follow
standard procedures.

Mass modification at the heterocyclic base with diethylene glycol
monomethyl ether: Nucleosidic starting material was, as in previous
examples, 5-(3-aminopropynyl-1)-2'-deoxyuridine. 12.02 g (100.0 mmol)
freshly distilled diethylene glycol monomethyl ether dissolved in 50 ml
absolute pyridine was reacted with 10.01 g (100.0 mmol) recrystallized
succinic anhydride in the presence of 1.22 g (10.0 mmol) 4-N,N-
dimethylaminopyridine (DMAP) overnight at room temperature. Yield was
18.35 g (82.3 mmol; 82.3%). 11.06 g (50.0 mmol) of succinylated
diethylene glycol monomethyl ether was transformed into the 4-
nitrophenylester and, subsequently, 12.5 mmol was reacted with 2.81 g
(10.0 mmol) of 5-(3-aminopropynyl-1)-2'-deoxyuridine. Yield after silica
gel column chromatography and precipitation into pentane was 3.34 g
(6.9 mmol; 69%). After dimethoxytritylation and transformation into the
nucleoside-β-cyanoethylphosphoamidite, the mass-modified building block
is incorporated into automated chemical DNA synthesis. Within the
sequence of the nucleic acid primer, one or more of the
thymidine/uridine residues can be substituted by this mass-modified
nucleotide.
Mass modification at the heterocyclic base with glycine: Starting
material was N6-benzoyl-8-bromo-5'-O-(4,4'-dimethoxytrityl)-2'-
deoxyadenosine (Singh et al., Nuc. Acids Res. 18:3339-45, 1990).
632.5 mg (1.0 mmol) of this 8-bromo-deoxyadenosine derivative was
suspended in 3 ml absolute ethanol and reacted with 251.2 mg (2.0 mmol)
glycine methyl ester (hydrochloride) in the presence of 241.4 mg
(2.1 mmol; 366 µl) N,N-diisopropylethylamine and refluxed until the
starting nucleosidic material had disappeared (4-6 hours) as checked by
thin layer chromatography (TLC). The solvent was evaporated and the
residue purified by silica gel chromatography (column 2.5 x 50 cm)
using solvent mixtures of chloroform/methanol containing 0.1% pyridine.
Product fractions were combined, the solvent evaporated, the fractions
dissolved in 5 ml dichloromethane and precipitated into 100 ml pentane.
Yield was 487 mg (0.76 mmol; 76%). Transformation into the corresponding
nucleoside-β-cyanoethylphosphoamidite and integration into automated
chemical DNA synthesis is performed under standard conditions. During
final deprotection with aqueous concentrated ammonia, the methyl group
is removed from the glycine moiety. The mass-modified building block can
substitute for one or more deoxyadenosine/adenosine residues in the
nucleic acid primer sequence.

Mass modification at the heterocyclic base with glycylglycine:
632.5 mg (1.0 mmol) N6-benzoyl-8-bromo-5'-O-(4,4'-dimethoxytrityl)-2'-
deoxyadenosine was suspended in 5 ml absolute ethanol and reacted with
324.3 mg (2.0 mmol) glycyl-glycine methyl ester in the presence of
241.4 mg (2.1 mmol; 366 µl) N,N-diisopropylethylamine. The mixture was
refluxed and completeness of the reaction checked by TLC. Yield after
silica gel column chromatography and precipitation into pentane was
464 mg (0.65 mmol; 65%). Transformation into the nucleoside-β-
cyanoethylphosphoamidite and into synthetic oligonucleotides is done
according to standard procedures.
Mass modification at the heterocyclic base with glycol monomethyl
ether: Starting material was 5'-O-(4,4-dimethoxytrityl)-2'-amino-2'-
deoxythymidine synthesized (Verheyden et al., J. Org. Chem. 36:250-54,
1971; Sasaki et al., J. Org. Chem. 41:3138-43, 1976; Imazawa et al.,
J. Org. Chem. 44:2039-41, 1979; Hobbs et al., J. Org. Chem. 42:714-19,
1976; Ikehara et al., Chem. Pharm. Bull. Japan 26:240-44, 1978). 5'-O-
(4,4-Dimethoxytrityl)-2'-amino-2'-deoxythymidine (559.62 mg; 1.0 mmol)
was reacted with 2.0 mmol of the 4-nitrophenyl ester of succinylated
ethylene glycol monomethyl ether in 10 ml dry DMF in the presence of
1.0 mmol (140 µl) triethylamine for 18 hours at room temperature. The
reaction mixture was evaporated in vacuo, co-evaporated with toluene,
redissolved in dichloromethane and purified by silica gel
chromatography (Si60, Merck; column: 2.5 x 50 cm; eluent:
chloroform/methanol mixtures containing 0.1% triethylamine). The
product containing fractions were combined, evaporated and precipitated
into pentane. Yield was 524 mg (0.73 mmol; 73%). Transformation into the
nucleoside-β-cyanoethyl-N,N-diisopropylphosphoamidite and incorporation
into the automated chemical DNA synthesis protocol is performed by
standard procedures. The mass-modified deoxythymidine derivative can
substitute for one or more of the thymidine residues in the nucleic
acid primer.

In an analogous way, by employing the 4-nitrophenyl ester of
succinylated diethylene glycol monomethyl ether and triethylene glycol
monomethyl ether, the corresponding mass-modified oligonucleotides are
prepared. In the case of only one incorporated mass-modified nucleoside
within the sequence, the mass difference between the ethylene,
diethylene and triethylene glycol derivatives is 44.05, 88.1 and 132.15
daltons, respectively.
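
The three values quoted above are multiples of a single 44.05 dalton
increment per glycol unit, as the following sketch shows. The function
name is hypothetical, and the extension to more than one modified
nucleoside per primer assumes that the shifts simply add, which the
text does not state.

```python
GLYCOL_UNIT_MASS = 44.05  # daltons per glycol unit, as quoted in the text


def glycol_mass_shift(units, modified_sites=1):
    """Mass shift for `units` glycol repeats at each of `modified_sites`
    nucleosides; additivity across sites is an assumption."""
    return round(units * GLYCOL_UNIT_MASS * modified_sites, 2)


for name, units in [("ethylene", 1), ("diethylene", 2), ("triethylene", 3)]:
    print(f"{name} glycol derivative: {glycol_mass_shift(units)} daltons")
```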

Mass modification at the heterocyclic base by alkylation:
Phosphorothioate-containing oligonucleotides were prepared (Gait et
al., Nuc. Acids Res. 19:1183, 1991). One, several or all
internucleotide linkages can be modified in this way. The (-)M13 nucleic
acid primer sequence (17-mer) 5'-dGTAAAACGACGGCCAGT (SEQ ID NO 29) is
synthesized in 0.25 µmole scale on a DNA synthesizer and one
phosphorothioate group introduced after the final synthesis cycle (G to
T coupling). Sulfurization, deprotection and purification followed
standard protocols. Yield was 31.4 nmole (12.6% overall yield),
corresponding to 31.4 nmole phosphorothioate groups. Alkylation was
performed by dissolving the residue in 31.4 µl TE buffer (0.01 M Tris
pH 8.0, 0.001 M EDTA) and by adding 16 µl of a 20 mM solution of
2-iodoethanol (320 nmole; 10-fold excess with respect to
phosphorothioate diesters) in N,N-dimethylformamide (DMF). The
alkylated oligonucleotide was purified by standard reversed phase HPLC
(RP-18 Ultrasphere, Beckman; column: 4.5 x 250 mm; 100 mM triethyl
ammonium acetate, pH 7.0 and a gradient of 5 to 40% acetonitrile).

In a variation of this procedure, the nucleic acid primer containing
one or more phosphorothioate phosphodiester bonds is used in the Sanger
sequencing reactions. The primer-extension products of the four
sequencing reactions are purified, cleaved off the solid support,
lyophilized and dissolved in 4 µl each of TE buffer pH 8.0 and
alkylated by addition of 2 µl of a 20 mM solution of 2-iodoethanol in
DMF. It is then analyzed by ES and/or MALDI mass spectrometry.

In an analogous way, employing instead of 2-iodoethanol, e.g.,
3-iodopropanol or 4-iodobutanol, mass-modified nucleic acid primers are
obtained with a mass difference of 14.03, 28.06 and 42.03 daltons
respectively compared to the unmodified phosphorothioate
phosphodiester-containing oligonucleotide.

Example 18 Mass Modification of Nucleotide Triphosphates.

Mass modification of nucleotide triphosphates at the 2' and 3'
amino function: Starting material was 2'-azido-2'-deoxyuridine prepared
according to literature (Verheyden et al., J. Org. Chem. 36:250, 1971),
which was 4,4-dimethoxytritylated at 5'-OH with 4,4-dimethoxytrityl
chloride in pyridine and acetylated at 3'-OH with acetic anhydride in a
one-pot reaction using standard reaction conditions. With 191 mg
(0.71 mmol) 2'-azido-2'-deoxyuridine as starting material, 396 mg
(0.65 mmol; 90.8%) 5'-O-(4,4-dimethoxytrityl)-3'-O-acetyl-2'-azido-2'-
deoxyuridine was obtained after purification via silica gel
chromatography. Reduction of the azido group was performed (Barta et
al., Tetrahedron 46:587-94, 1990). Yield of 5'-O-(4,4-dimethoxytrityl)-
3'-O-acetyl-2'-amino-2'-deoxyuridine after silica gel chromatography
was 288 mg (0.49 mmol; 76%). This protected 2'-amino-2'-deoxyuridine
derivative (588 mg, 1.0 mmol) was reacted with 2 equivalents (927 mg;
2.0 mmol) N-Fmoc-glycine pentafluorophenyl ester in 10 ml dry DMF
overnight at room temperature in the presence of 1.0 mmol (174 µl)
N,N-diisopropylethylamine. Solvents were removed by evaporation in vacuo
and the residue purified by silica gel chromatography. Yield was 711 mg
(0.71 mmol; 82%). Detritylation was achieved by a one hour treatment
with 80% aqueous acetic acid at room temperature. The residue was
evaporated to dryness, co-evaporated twice with toluene, suspended in
1 ml dry acetonitrile and 5'-phosphorylated with POCl3 and directly
transformed in a one-pot reaction to the 5'-triphosphate using 3 ml of
a 0.5 M solution (1.5 mmol) tetra(tri-n-butylammonium) pyrophosphate in
DMF according to literature. The Fmoc and the 3'-O-acetyl groups were
removed by a one-hour treatment with concentrated aqueous ammonia at
room temperature and the reaction mixture evaporated and lyophilized.
Purification also followed standard procedures by using anion-exchange
chromatography on DEAE-Sephadex with a linear gradient of
triethylammonium bicarbonate (0.1 M - 1.0 M). Triphosphate containing
fractions, checked by thin layer chromatography on polyethyleneimine
cellulose plates, were collected, evaporated and lyophilized. Yield by
UV-absorbance of the uracil moiety was 68% or 0.48 mmol.

A glycyl-glycine modified 2'-amino-2'-deoxyuridine-5'-triphosphate
was obtained by removing the Fmoc group from 5'-O-(4,4-
dimethoxytrityl)-3'-O-acetyl-2'-N-(N-9-fluorenylmethyloxycarbonyl-
glycyl)-2'-amino-2'-deoxyuridine by a one-hour treatment with a 20%
solution of piperidine in DMF at room temperature, evaporation of
solvents, two-fold co-evaporation with toluene and subsequent
condensation with N-Fmoc-glycine pentafluorophenyl ester. Starting with
1.0 mmol of the 2'-N-glycyl-2'-amino-2'-deoxyuridine derivative and
following the procedure described above, 0.72 mmol (72%) of the
corresponding 2'-(N-glycyl-glycyl)-2'-amino-2'-deoxyuridine-5'-
triphosphate was obtained.

Starting with 5'-O-(4,4-dimethoxytrityl)-3'-O-acetyl-2'-amino-2'-
deoxyuridine and coupling with N-Fmoc-β-alanine pentafluorophenyl
ester, the corresponding 2'-(N-β-alanyl)-2'-amino-2'-deoxyuridine-5'-
triphosphate is synthesized. These modified nucleoside triphosphates
are incorporated during the Sanger DNA sequencing process in the
primer-extension products. The mass difference between the glycine,
β-alanine and glycyl-glycine mass-modified nucleosides is, per
nucleotide incorporated, 58.06, 72.09 and 115.1 daltons, respectively.

When starting with 5'-O-(4,4-dimethoxytrityl)-3'-amino-2',3'-
dideoxythymidine, the corresponding 3'-(N-glycyl)-3'-amino-, 3'-(N-
glycyl-glycyl)-3'-amino-, and 3'-(N-β-alanyl)-3'-amino-2',3'-
dideoxythymidine-5'-triphosphates can be obtained. These mass-modified
nucleoside triphosphates serve as a terminating nucleotide unit in the
Sanger DNA sequencing reactions, providing a mass difference per
terminated fragment of 58.06, 72.09 and 115.1 daltons respectively when
used in the multiplexing sequencing mode. The mass-differentiated
fragments are analyzed by MALDI mass spectrometry.
Mass modification of nucleotide triphosphates at C-5 of the
heterocyclic base: 0.281 g (1.0 mmol) 5-(3-aminopropynyl-1)-2'-
deoxyuridine was reacted with either 0.927 g (2.0 mmol) N-Fmoc-glycine
pentafluorophenylester or 0.955 g (2.0 mmol) N-Fmoc-β-alanine
pentafluorophenyl ester in 5 ml dry DMF in the presence of 0.129 g
N,N-diisopropylethylamine (174 µl, 1.0 mmol) overnight at room
temperature. Solvents were removed by evaporation in vacuo and the
condensation products purified by flash chromatography on silica gel
(Still et al., J. Org. Chem. 43:2923-25, 1978). Yields were 476 mg
(0.85 mmol; 85%) for the glycine and 436 mg (0.76 mmol; 76%) for the
β-alanine derivatives. For the synthesis of the glycyl-glycine
derivative, the Fmoc group of 1.0 mmol Fmoc-glycine-deoxyuridine
derivative was removed by one-hour treatment with 20% piperidine in DMF
at room temperature. Solvents were removed by evaporation in vacuo, the
residue was coevaporated twice with toluene and condensed with 0.927 g
(2.0 mmol) N-Fmoc-glycine pentafluorophenyl ester and purified as
described above. Yield was 445 mg (0.72 mmol; 72%). The glycyl-,
glycyl-glycyl- and β-alanyl-2'-deoxyuridine derivatives N-protected
with the Fmoc group were transformed to the 3'-O-acetyl derivatives by
tritylation with 4,4-dimethoxytrityl chloride in pyridine and
acetylation with acetic anhydride in pyridine in a one-pot reaction and
subsequently detritylated by one hour treatment with 80% aqueous acetic
acid according to standard procedures. Solvents were removed, the
residues dissolved in 100 ml chloroform and extracted twice with 50 ml
10% sodium bicarbonate and once with 50 ml water, dried with sodium
sulfate, the solvent evaporated and the residues purified by flash
chromatography on silica gel. Yields were 361 mg (0.60 mmol; 71%) for
the glycyl-, 351 mg (0.57 mmol; 75%) for the β-alanyl- and 323 mg
(0.49 mmol; 68%) for the glycyl-glycyl-3'-O-acetyl-2'-deoxyuridine
derivatives, respectively. Phosphorylation at the 5'-OH with POCl3,
transformation into the 5'-triphosphate by in situ reaction with
tetra(tri-n-butylammonium) pyrophosphate in DMF, 3'-de-O-acetylation,
cleavage of the Fmoc group, and final purification by anion-exchange
chromatography on DEAE-Sephadex were performed, and yields according to
UV-absorbance of the uracil moiety were 0.41 mmol 5-(3-(N-glycyl)-
amidopropynyl-1)-2'-deoxyuridine-5'-triphosphate (84%), 0.43 mmol
5-(3-(N-β-alanyl)-amidopropynyl-1)-2'-deoxyuridine-5'-triphosphate
(75%) and 0.38 mmol 5-(3-(N-glycyl-glycyl)-amidopropynyl-1)-2'-
deoxyuridine-5'-triphosphate (78%). These mass-modified nucleoside
triphosphates were incorporated during the Sanger DNA sequencing
primer-extension reactions.

When using 5-(3-aminopropynyl)-2',3'-dideoxyuridine as starting
material and following an analogous reaction sequence, the
corresponding glycyl-, glycyl-glycyl- and β-alanyl-2',3'-
dideoxyuridine-5'-triphosphates were obtained in yields of 69%, 63% and
71%, respectively. These mass-modified nucleoside triphosphates serve
as chain-terminating nucleotides during the Sanger DNA sequencing
reactions. The mass-modified sequencing ladders are analyzed by either
ES or MALDI mass spectrometry.

Mass modification of nucleotide triphosphates: 727 mg (1.0 mmol)
of N6-(4-tert-butylphenoxyacetyl)-8-glycyl-5'-(4,4-dimethoxytrityl)-2'-
deoxyadenosine or 800 mg (1.0 mmol) N6-(4-tert-butylphenoxyacetyl)-8-
glycyl-glycyl-5'-(4,4-dimethoxytrityl)-2'-deoxyadenosine prepared
according to literature (Köster et al., Tetrahedron 37:362, 1981) were
acetylated with acetic anhydride in pyridine at the 3'-OH, detritylated
at the 5'-position with 80% acetic acid in a one-pot reaction and
transformed into the 5'-triphosphates via phosphorylation with POCl3
and reaction in situ with tetra(tri-n-butylammonium) pyrophosphate.
Deprotection of the tert-butylphenoxyacetyl, the 3'-O-acetyl and the
O-methyl group at the glycine residues was achieved with concentrated
aqueous ammonia for ninety minutes at room temperature. Ammonia was
removed by lyophilization and the residue washed with dichloromethane,
solvent removed by evaporation in vacuo and the remaining solid
material purified by anion-exchange chromatography on DEAE-Sephadex
using a linear gradient of triethylammonium bicarbonate from 0.1 to
1.0 M. The nucleoside triphosphate containing fractions (checked by TLC
on polyethyleneimine cellulose plates) were combined and lyophilized.
Yield of the 8-glycyl-2'-deoxyadenosine-5'-triphosphate (determined by
UV-absorbance of the adenine moiety) was 57% (0.57 mmol). The yield for
the 8-glycyl-glycyl-2'-deoxyadenosine-5'-triphosphate was 51%
(0.51 mmol). These mass-modified nucleoside triphosphates were
incorporated during primer-extension in the Sanger DNA sequencing
reactions.

When using the corresponding N6-(4-tert-butylphenoxyacetyl)-8-
glycyl- or -glycyl-glycyl-5'-O-(4,4-dimethoxytrityl)-2',3'-
dideoxyadenosine derivatives as starting materials (for the
introduction of the 2',3'-function: Seela et al., Helvetica Chimica
Acta 74:1048-58, 1991) and using an analogous reaction sequence, the
chain-terminating mass-modified nucleoside triphosphates 8-glycyl- and
8-glycyl-glycyl-2',3'-dideoxyadenosine-5'-triphosphates were obtained
in 53 and 47% yields, respectively. The mass-modified sequencing
fragment ladders are analyzed by either ES or MALDI mass spectrometry.
Example 19 Mass Modification of Nucleotides by Alkylation After Sanger Sequencing.

2',3'-Dideoxythymidine-5'-(alpha-S)-triphosphate was
prepared according to published procedures (for the alpha-S-triphosphate
moiety: Eckstein et al., Biochemistry 15:1685, 1976 and Accounts Chem.
Res. 12:204, 1978; and for the 2',3'-dideoxy moiety: Seela et al.,
Helvetica Chimica Acta 74:1048-58, 1991). Sanger DNA sequencing reactions
employing 2'-deoxythymidine-5'-(alpha-S)-triphosphate are performed
according to standard protocols. When
2',3'-dideoxythymidine-5'-(alpha-S)-triphosphate is used, it replaces the
unmodified 2',3'-dideoxythymidine-5'-triphosphate in standard Sanger DNA
sequencing. The template (2 picomole) and the nucleic acid M13
sequencing primer (4 picomole) are annealed by heating to 65°C in 100 µl
of 10 mM Tris-HCl, pH 7.5, 10 mM MgCl2, 50 mM NaCl, 7 mM dithiothreitol
(DTT) for 5 minutes and slowly brought to 37°C during a one hour period.
The sequencing reaction mixtures contain, as exemplified for the
T-specific termination reaction, in a final volume of 150 µl, 200 µM
(final concentration) each of dATP, dCTP, dTTP, 300 µM c7-deaza-dGTP,
5 µM 2',3'-dideoxythymidine-5'-(alpha-S)-triphosphate and 40 units
Sequenase. Polymerization is performed for 10 minutes at 37°C, the
reaction mixture heated to 70°C to inactivate the Sequenase, ethanol
precipitated and coupled to thiolated Sequelon membrane disks (8 mm
diameter). Alkylation is performed by treating the disks with 10 µl of
10 mM solution of either 2-iodoethanol or 3-iodopropanol in NMM
(N-methylmorpholine/water/2-propanol, 2/49/49, v/v/v) (three times),
washing with 10 µl NMM (three times) and cleaving the alkylated
T-terminated primer-extension products off the support by treatment
with DTT. Analysis of the mass-modified fragment families is performed
with either ES or MALDI mass spectrometry.

Example 20 Mass Modification of an Oligonucleotide.

This method, in addition to mass modification, also modifies
the phosphate backbone of the nucleic acids to a non-ionic polar form.
Oligonucleotides can be obtained by chemical synthesis or by enzymatic
synthesis using DNA polymerases and α-thio nucleoside triphosphates.

This reaction was performed using DMT-TpT as a starting
material but the use of an oligonucleotide with an alpha thio group is also
appropriate. For thiolation, 45 mg (0.05 mmol) of compound 1 (Figure 15)
was dissolved in 0.5 ml acetonitrile and thiolated in a 1.5 ml tube with
1,1-dioxo-3H-benzo[1,2]dithiol-3-one (Beaucage reagent). The reaction was
allowed to proceed for 10 minutes and the product was concentrated by thin
layer chromatography with the solvent system dichloromethane/96%
ethanol/pyridine (87%/13%/1%; v/v/v). The thiolated compound 2 (Figure
15) was deprotected by treatment with a mixture of concentrated aqueous
ammonia/acetonitrile (1/1; v/v) at room temperature. This reaction was
monitored by thin layer chromatography and the quantitative removal of the
beta-cyanoethyl group was accomplished in one hour. This reaction mixture
was evaporated in vacuo.
To synthesize the S-(2-amino-2-oxoethyl)thiophosphate
triester of DMT-TpT (compound 4), the foam obtained after evaporation of
the reaction mixture (compound 3) was dissolved in 0.3 ml
acetonitrile/pyridine (5/1; v/v) and a 1.5 molar excess of iodoacetamide
was added. The reaction was complete in 10 minutes and the precipitated
salts were removed by centrifugation. The supernatant was lyophilized,
dissolved in 0.3 ml acetonitrile and purified by preparative thin layer
chromatography with a solution of dichloromethane/96% ethanol (85%/15%;
v/v). Two fractions were obtained, each containing one of the two
diastereoisomers. The two forms were separated by HPLC.

Example 21 MALDI-MS Analysis of a Mass-Modified Oligonucleotide.

A 17-mer was mass modified at C-5 of one or two
deoxyuridine moieties.
5-[13-(2-Methoxyethoxyl)-tridecyne-1-yl]-5'-(4,4'-dimethoxytrityl)-2'-deoxyuridine-3'-β-cyanoethyl-N,N-diisopropylphosphoamidite
was used to synthesize the modified 17-mers.
The modified 17-mers were:

                 X
                 |
d(TAAAACGACGGCCAGUG)   (molecular mass: 5454) (SEQ ID NO. 30)

  X              X
  |              |
d(UAAAACGACGGCCAGUG)   (molecular mass: 5634) (SEQ ID NO. 31)

where X = -C≡C-(CH2)11-OH
(unmodified 17-mer: molecular mass: 5273)

The samples were prepared and 500 fmol of each modified
17-mer was analyzed using MALDI-MS. Conditions used were reflectron
positive ion mode with an acceleration of 5 kV and post-acceleration of 20
kV. The MALDI-TOF spectra which were generated were superimposed
and are shown in Figure 16. Thus, mass modification provides a distinction
detectable by mass spectrometry which can be used to identify base
sequence information.
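The mass distinction reported in this example can be checked with simple arithmetic. The sketch below uses only the three molecular masses stated above (5273, 5454 and 5634) and computes the shift contributed by each modified deoxyuridine; the function and variable names are hypothetical and chosen for illustration.

```python
# Molecular masses as stated in this example.
UNMODIFIED_MASS = 5273
MODIFIED_MASSES = {1: 5454, 2: 5634}  # number of modified dU -> mass


def shift_per_modification(unmodified, modified):
    """Return the average mass shift per modification for each oligomer."""
    return {
        count: (mass - unmodified) / count
        for count, mass in modified.items()
    }


for count, shift in shift_per_modification(UNMODIFIED_MASS, MODIFIED_MASSES).items():
    print(f"{count} modification(s): {shift:.1f} Da per modification")
```

Both oligomers give a shift of roughly 180 Da per modified residue, which is consistent with the singly and doubly modified 17-mers appearing as separate peaks in the superimposed spectra.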

Example 22 Capture and Sequencing of a Double-Stranded Target Nucleic Acid.

In another experiment, a nucleic acid was captured and
sequenced by strand-displacement polymerization. This reaction is shown
schematically in Figure 17. Double-stranded DNA target was prepared by
PCR and attached to magnetic beads as described in Example 6. EcoR I
digested plasmid NB34 was used as the DNA template for amplification.
NB34 comprises a pCR™ II plasmid (Invitrogen) with a one kb target
human DNA insert. PCR was performed with a 16-nucleotide upstream
primer (primer I, 5'-AACAGCTATGACCATG-3'; SEQ ID NO. 32), and a
downstream 5'-end biotinylated 18-nucleotide primer (primer II,
5'-biotin-CTGAATTAGTCAGGTTGG-3'; SEQ ID NO. 33). Five hundred basepair
PCR products, containing a single BstX I site, were immobilized by
attachment to magnetic beads which were resuspended in a total of 300 µl
reaction buffer containing 200 units of BstX I restriction endonuclease
(Boehringer Mannheim; Indianapolis, IN), 50 mM Tris-HCl pH 7.5, 10 mM
MgCl2, 100 mM NaCl and 1 mM dithiothreitol. The mixture was incubated
at 45°C for three hours or until digestion was complete, which was
monitored by agarose gel electrophoresis. After digestion, magnetic beads
were washed twice with 300 µl of TE to remove digested and
non-immobilized fragments, excess nucleotides and restriction endonuclease.

This immobilized DNA was dephosphorylated by
resuspending the beads in 100 µl buffer (500 mM Tris-HCl, pH 9.0, 1 mM
MgCl2, 0.1 mM ZnCl2, and 1 mM spermidine) containing five units of calf
intestinal alkaline phosphatase (Promega; Madison, WI). The reaction was
incubated at 37°C for 15 minutes and at 56°C for 15 minutes. Five
additional units of calf intestinal alkaline phosphatase were added and a
second incubation was performed at 37°C for 15 minutes and at 56°C for
15 minutes. Beads were washed twice with TE and resuspended in 300 µl
of fresh TE containing 1 M NaCl.

Loading of the beads was checked by incubating 10 µl of the
beads with 10 µl of formamide at 95°C for 5 minutes (or by boiling in TE).
The mixture was analyzed by 1% agarose gel electrophoresis with ethidium
bromide staining. A 10 µl bead aliquot generally contains about 80 ng of
immobilized double stranded DNA.
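To relate the 80 ng loading figure to the picomole amounts used in the ligation steps, a mass-to-mole conversion is useful. The sketch below is a minimal illustration, assuming an average of about 650 g/mol per base pair of double-stranded DNA (a common approximation, not a value from this specification) and using the 500 basepair PCR product length as an upper bound on the immobilized fragment length; the function name `dsdna_pmol` is hypothetical.

```python
# Approximate average molar mass of one base pair of double-stranded DNA.
# This is an assumed rule-of-thumb value, not taken from the specification.
AVG_BP_MASS_G_PER_MOL = 650.0


def dsdna_pmol(nanograms, length_bp, bp_mass=AVG_BP_MASS_G_PER_MOL):
    """Convert a mass of double-stranded DNA (ng) to picomoles of molecules."""
    # ng -> g is 1e-9 and mol -> pmol is 1e12, so the net factor is 1e3.
    return nanograms * 1000.0 / (length_bp * bp_mass)


# About 80 ng per 10 µl aliquot; 500 bp is the undigested PCR product length.
per_aliquot = dsdna_pmol(80, 500)
print(f"{per_aliquot:.3f} pmol per 10 µl aliquot")
print(f"{per_aliquot * 30:.2f} pmol per 300 µl bead suspension")
```

Because digestion shortens the immobilized fragment, the true molar amount per aliquot would be somewhat higher than this estimate.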
A partial duplex DNA probe containing a four base 3'
overhang was used as a sequencing primer and was ligated with BstX I
digested DNA fragments which were immobilized on magnetic beads. The
partial duplex had a 5'-fluorescein labeled 23-mer (DF25-5F) containing a
5' base pairing region and a 4-base 3' single stranded region (which is
complementary to the sequence of the 5'-protruding end of the
corresponding BstX I digested target DNA as prepared above) and a 19-mer
(G-CM1) complementary to the base pairing region of the 23-mer. The
19-mer was 5' phosphorylated by the T4 DNA Polymerase and annealed to the
corresponding 23-mer in TE at the same molar ratio. Beads, prepared from
alkaline phosphatase treatment, which have about 10 pmol immobilized
DNA template, were ligated to 25 pmol of partially duplex probe in a 100
µl volume containing 200 units of T4 DNA ligase (New England Biolabs;
Beverly, MA), 50 mM Tris-HCl, pH 7.8, 10 mM MgCl2, 10 mM
dithiothreitol, 1 mM ATP, 25 µg/ml bovine serum albumin. Ligation
reactions were performed at room temperature for two hours or 4°C
overnight. Beads were washed twice with TE and resuspended in 300 µl of
the same buffer.

Sequencing reactions: Thirty µl of beads containing the
ligation product were used for each sequencing reaction. Beads were
resuspended in a 13 µl volume containing 1.5 µl of 10 x Klenow buffer (100
mM Tris-HCl, pH 7.5, 50 mM MgCl2, and 75 mM dithiothreitol) and with
or without one µl of single stranded DNA binding protein (SSB, 5 µg/µl;
USB; Cleveland, Ohio). Mixtures were incubated on ice for 5 minutes
followed by the addition of 5 units of Klenow Fragment (New England
Biolabs). The reaction volume was split into four termination mixes, each
consisting of 1 µl DMSO and 3 µl of the appropriate termination mixture.
Termination mixtures were made in Klenow buffer and comprise the
nucleotide concentrations shown below in Table 11.

Table 11

Termination   dATP      dGTP      dCTP      dTTP      ddNTPs
Mix           in mM     in mM     in mM     in mM

ddATP mix     10        100       100       100       100 mM ddATP
ddGTP mix     100       5         100       100       120 mM ddGTP
ddCTP mix     100       100       10        100       100 mM ddCTP
ddTTP mix     100       100       100       5         500 mM ddTTP
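Each termination mix in Table 11 lowers the concentration of one deoxynucleotide and supplies the matching dideoxynucleotide. The sketch below encodes the table values as printed and computes the terminator to competing-dNTP ratio for each mix; since the ratio is unitless, it does not depend on the concentration unit. The dictionary layout and the function name `terminator_ratio` are hypothetical.

```python
# Concentrations as listed in Table 11 (same unit in every column).
TERMINATION_MIXES = {
    "ddATP": {"dATP": 10, "dGTP": 100, "dCTP": 100, "dTTP": 100, "ddNTP": 100},
    "ddGTP": {"dATP": 100, "dGTP": 5, "dCTP": 100, "dTTP": 100, "ddNTP": 120},
    "ddCTP": {"dATP": 100, "dGTP": 100, "dCTP": 10, "dTTP": 100, "ddNTP": 100},
    "ddTTP": {"dATP": 100, "dGTP": 100, "dCTP": 100, "dTTP": 5, "ddNTP": 500},
}


def terminator_ratio(terminator):
    """Ratio of the ddNTP to the dNTP it competes with in that mix."""
    mix = TERMINATION_MIXES[terminator]
    competing_dntp = terminator[1:]  # e.g. "ddATP" -> "dATP"
    return mix["ddNTP"] / mix[competing_dntp]


for name in TERMINATION_MIXES:
    print(f"{name} mix: ddNTP/dNTP = {terminator_ratio(name):g}")
```

A higher ratio biases the reaction toward earlier termination for that base; the ddTTP mix has by far the highest ratio in the table.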

Termination mixtures were incubated for 20 minutes at
ambient temperature. Two µl of chase solution (0.5 mM of each of four
dNTPs in Klenow buffer) were added to each reaction tube and mixtures
were incubated for another 15 minutes, again at ambient temperature.
Magnetic beads were precipitated with a magnetic particle concentrator (or
centrifugation) and the supernatant discarded. Beads were resuspended in
a solution containing 10 µl of deionized formamide, 5 mg/ml dextran blue
and 0.1% SDS, heated to 95°C for 5 minutes, and stored on ice for less
than 10 minutes. Samples were analyzed on a DNA sequencing gel and on
an ALF DNA sequencer (Pharmacia; Piscataway, NJ) using a 6%
polyacrylamide gel with 7 M urea and 0.6 x TBE. Surprisingly, sequencing
reactions performed in the presence of single-stranded DNA binding protein
showed considerable improvement in resolution. Only 50 bases were
resolved from reactions performed without single-stranded DNA binding
protein (Figure 18, bottom panel) whereas 200 bases could be resolved from
reactions performed in the presence of single-stranded DNA binding protein
(Figure 18, top panel).

Example 23 Specificity of Double-Strand Sequencing by Displacement.

Another experiment was performed to determine the
specificity and applicability of the nick translation strand displacement
method of sequencing double-stranded nucleic acids. A schematic of the
experimental design is shown in Figure 19. Briefly, a double-stranded target
DNA was prepared by digesting double-stranded ΦX174 phage DNA with
TspR I restriction endonuclease. TspR I has a recognition site of
NNCAGTGNN and cleaves ΦX174 into 12 fragments each with distinctive
3' protruding ends. Possible ends are shown in Table 12.



Table 12

5'-AACACTGAC-3'
5'-AACAGTGGA-3'
5'-ACCACTGAC-3'
5'-AACACTGGT-3'
5'-ATCAGTGAC-3'
5'-ACCAGTGTT-3'
5'-GTCAGTGTT-3'
5'-GTCAGTGGT-3'
5'-GTCACTGAT-3'
5'-TCCACTGTT-3'
5'-TGCAGTGGA-3'
5'-TCCACTGCA-3'
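The ends listed in Table 12 can be checked for internal consistency against the stated TspR I recognition site NNCAGTGNN. The sketch below is an illustrative check, not part of the described procedure: it confirms that every 9-base end carries CAGTG or its reverse complement CACTG at its center, and that the twelve ends fall into six reverse-complement pairs, as expected for the two strands at each cleaved site. The function names are hypothetical.

```python
# The twelve 3' protruding ends listed in Table 12 (5'->3').
ENDS = [
    "AACACTGAC", "AACAGTGGA", "ACCACTGAC", "AACACTGGT",
    "ATCAGTGAC", "ACCAGTGTT", "GTCAGTGTT", "GTCAGTGGT",
    "GTCACTGAT", "TCCACTGTT", "TGCAGTGGA", "TCCACTGCA",
]

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def has_tspr1_core(seq):
    """True if the central five bases are CAGTG or CACTG (NNCAGTGNN)."""
    return len(seq) == 9 and seq[2:7] in ("CAGTG", "CACTG")


def complementary_pairs(ends):
    """Group ends into unordered reverse-complement pairs."""
    pairs = set()
    for end in ends:
        partner = reverse_complement(end)
        if partner in ends:
            pairs.add(frozenset((end, partner)))
    return pairs


print(all(has_tspr1_core(e) for e in ENDS))
print(len(complementary_pairs(ENDS)))
```

Because each end is unique and has exactly one complementary partner in the list, a probe carrying one of these overhangs should select a single fragment end, which is the specificity this example sets out to test.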



ΦX174 DNA (5 pmol) was dephosphorylated using calf
intestinal alkaline phosphatase. Briefly, ΦX174 DNA was resuspended in
100 µl buffer (500 mM Tris-HCl pH 9.0, 1 mM MgCl2, 0.1 mM ZnCl2, and
1 mM spermidine) containing 5 units of calf intestinal alkaline phosphatase
(Promega; Madison, WI). The reaction was incubated at 37°C for 15
minutes and at 56°C for 15 minutes. Five additional units of calf intestinal
alkaline phosphatase were added and a second incubation was performed at
37°C for 15 minutes and at 56°C for 15 minutes. DNA in the samples was
extracted once with phenol, once with phenol/chloroform, and once with
chloroform, after which nucleic acid was precipitated in 0.3 M sodium
acetate/2.5 volumes ethanol. Precipitated ΦX174 DNA was washed twice
with TE and resuspended in 300 µl of TE containing 1 M NaCl.

Double-stranded probes, comprising biotin (B), fluorescein
(F), and infrared dye (CY5) labels, were synthesized and anchored to magnetic
beads as shown in Table 13.

Table 13

DF27-1       5'-F-GATGATCCGACGCATCACATCAGTGAC-3'      (SEQ ID NO. 34)
             3'-B-CTACTAGGCTGCGTAGTG-p-5'             (SEQ ID NO. 35)

DF27-2       5'-F-GATGATCCGACGCATCACTCC ... -3'       (SEQ ID NO. 36)
             3'-B-CTACTAGGCTGCGTAGTG-p-5'             (SEQ ID NO. 37)

DF27-3       5'-F-GATGATCCGACGCATCACGTCAGTGTT-3'      (SEQ ID NO. 38)
             3'-B-CTACTAGGCTGCGTAGTG-p-5'             (SEQ ID NO. 39)

DF27-4       5'-F-GATGATCCGACGCATCACTGCAGTGGA-3'      (SEQ ID NO. 40)
             3'-B-CTACTAGGCTGCGTAGTG-p-5'             (SEQ ID NO. 41)

DF27-5-CY5   5'-CY5-GATGATCCGACGCATCACGTCACTGAT-3'    (SEQ ID NO. 42)
             3'-B-CTACTAGGCTGCGTAGTG-p-5'             (SEQ ID NO. 43)

DF27-6-CY5   5'-CY5-GATGATCCGACGCATCACAACAGTGGA-3'    (SEQ ID NO. 44)
             3'-B-CTACTAGGCTGCGTAGTG-p-5'             (SEQ ID NO. 45)

DF27-7       5'-F-GATGATCCGACGCATCACGTCAGTGGT-3'      (SEQ ID NO. 46)
             3'-B-CTACTAGGCTGCGTAGTG-p-5'             (SEQ ID NO. 47)

DF27-8       5'-F-GATGATCCGACGCATCACAACACTGGT-3'      (SEQ ID NO. 48)
             3'-B-CTACTAGGCTGCGTAGTG-p-5'             (SEQ ID NO. 49)

DF27-9       5'-F-GATGATCCGACGCATCACAACACTGAC-3'      (SEQ ID NO. 50)
             3'-B-CTACTAGGCTGCGTAGTG-p-5'             (SEQ ID NO. 51)

DF27-10      5'-F-GATGATCCGACGCATCACACCACTGAC-3'      (SEQ ID NO. 52)
             3'-B-CTACTAGGCTGCGTAGTG-p-5'             (SEQ ID NO. 53)
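The probe design in Table 13 can be read as a shared 18-base duplex plus a 9-base single-stranded 3' overhang on the labeled strand. The sketch below illustrates that reading using five probes whose top strands are fully legible in the table; it checks that the biotinylated strand is the base-by-base complement of the constant region and reports the fragment end each overhang would be expected to pair with, assuming ordinary antiparallel Watson-Crick pairing. All function and variable names are hypothetical.

```python
CONSTANT_TOP = "GATGATCCGACGCATCAC"   # labeled strand duplex region, 5'->3'
BOTTOM_3_TO_5 = "CTACTAGGCTGCGTAGTG"  # biotinylated strand as printed, 3'->5'

# Labeled (top) strands, 5'->3', label omitted.
PROBE_TOP = {
    "DF27-1": "GATGATCCGACGCATCACATCAGTGAC",
    "DF27-3": "GATGATCCGACGCATCACGTCAGTGTT",
    "DF27-4": "GATGATCCGACGCATCACTGCAGTGGA",
    "DF27-5-CY5": "GATGATCCGACGCATCACGTCACTGAT",
    "DF27-6-CY5": "GATGATCCGACGCATCACAACAGTGGA",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq):
    """Base-by-base complement, keeping the written order."""
    return seq.translate(COMPLEMENT)


def overhang(top_strand):
    """Single-stranded 3' portion that follows the constant duplex region."""
    if not top_strand.startswith(CONSTANT_TOP):
        raise ValueError("probe does not start with the constant region")
    return top_strand[len(CONSTANT_TOP):]


def captured_end(top_strand):
    """Fragment end (5'->3') expected to pair with the probe overhang."""
    return complement(overhang(top_strand))[::-1]


for name, top in PROBE_TOP.items():
    print(name, overhang(top), "pairs with", captured_end(top))
```

Keeping the duplex region constant and varying only the overhang means that specificity for a given fragment end rests entirely on the 9-base single-stranded portion.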



Beads with about 25 pmol of immobilized primer were ligated
to 3 pmol of TspR I digested ΦX174 DNA in 50 µl containing 400 units of
T4 DNA ligase (New England Biolabs; Beverly, MA), 50 mM Tris-HCl pH
7.8, 10 mM MgCl2, 10 mM dithiothreitol, 1 mM ATP and 25 µg/ml bovine
serum albumin. Ligation reactions were performed at 37°C for 30 minutes,
at 50°C to 55°C for one hour (thermal ligase), at room temperature for 2
hours or at 4°C overnight. After ligation, beads were washed twice with
TE and resuspended in 300 µl of the same buffer.

Sequencing reactions: For each sequencing reaction, 30 µl of
beads containing the ligation product was used. Beads were resuspended in
a 13 µl volume containing 1.5 µl of 10 x Klenow buffer (100 mM Tris-HCl,
pH 7.5, 50 mM MgCl2 and 75 mM dithiothreitol), and with or without 1 µl
of single-stranded DNA binding protein (SSB, 5 µg/µl; USB; Cleveland,
Ohio). Reaction mixtures were incubated on ice for 5 minutes, followed by
the addition of 5 units of Klenow Fragment (New England Biolabs). The
reaction volume was split into four termination mixes, each consisting of 1
µl DMSO plus 3 µl of the appropriate termination mix. Termination mixes
were made in Klenow buffer and comprise the nucleotide concentrations
shown in Table 11.

Termination mixtures were incubated for 20 minutes at
ambient temperature. Two µl of a chase solution containing 0.5 mM of each
of the four dNTPs in Klenow buffer was added to each reaction tube and
mixtures were incubated for another 15 minutes at ambient temperature.
Beads were precipitated by magnetic particle concentrator or centrifugation
and the supernatant discarded. Precipitated beads were resuspended in TE
or in a solution containing 10 µl deionized formamide, 5 mg/ml dextran blue
and 0.1% SDS, and heated to 95°C for 5 minutes. Mixtures were stored on
ice for less than 10 minutes and analyzed by a DNA sequencing gel and on
an ALF DNA sequencer (Pharmacia; Piscataway, NJ) using a 6%
polyacrylamide gel with 7 M urea and 0.6 x TBE.

One double stranded primer was used for each reaction and the
results achieved using primers DF27-1, DF27-2, DF27-4, DF27-5-CY5 and
DF27-6-CY5 are shown in Figures 20, 21, 22, 23 and 24, respectively.
Each primer was capable of generating sequencing information of up to 200
basepairs without significant interference from the 11 fragments with
non-complementary ends.

Other embodiments and uses of the invention will be apparent
to those skilled in the art from consideration of the specification and practice
of the invention disclosed herein. All U.S. Patents and other references
noted herein are specifically incorporated by reference. The specification
and examples should be considered exemplary only with the true scope and
spirit of the invention indicated by the following claims.

1. A method for sequencing a target nucleic acid comprising the steps
of:

a) providing a set of nucleic acid fragments each containing a
sequence that corresponds to a sequence of said target;

b) hybridizing said set to an array of nucleic acid probes,
wherein each probe comprises a double-stranded portion, a
single-stranded portion and a variable sequence within said
single-stranded portion, to form a target array of nucleic
acids;

c) determining molecular weights for a plurality of nucleic acids
of said target array; and

d) determining the sequence of said target nucleic acid.

2. A method for sequencing a target nucleic acid comprising the steps
of:

a) providing a set of nucleic acid fragments each containing a
sequence that corresponds to a sequence of said target;

b) hybridizing said set to an array of nucleic acid probes,
wherein each probe comprises a double-stranded portion, a
single-stranded portion and a variable sequence within said
single-stranded portion;

c) creating a mass modified extended nucleic acid by extending
and mass modifying a strand of the probe using the hybridized
fragment as a template;

d) determining molecular weights for a plurality of mass
modified extended nucleic acids by mass spectrometry; and

e) determining the sequence of said target nucleic acid.

3. A method for sequencing a target nucleic acid comprising the steps
of:

a) providing a set of partially single-stranded nucleic acid
fragments wherein each fragment contains a sequence that
corresponds to a sequence of the target;

b) hybridizing the single-stranded portions of the fragments to
single-stranded portions of a set of partially double-stranded
nucleic acid probes to form a set of complexes, and for each
complex;

i) ligating a single strand of the fragment to an adjacent
single strand of the probe; and

ii) extending the unligated strand of the complex by
strand-displacement polymerization using the ligated
strand as a template; and

c) determining the sequence of the target.

4. A method for sequencing a target nucleic acid comprising the steps
of:

a) providing a set of nucleic acid fragments each containing a
sequence which corresponds to a sequence of said target;

b) hybridizing said set of fragments to an array of mass modified
probes, wherein each probe comprises a double-stranded
portion, a single-stranded portion and a variable sequence
within said single-stranded portion;

c) extending a strand of the mass modified probes using the
hybridized fragments as templates;

d) determining molecular weights for a plurality of extended
mass modified strands; and

e) determining the sequence of said target.

5. A method for sequencing a target nucleic acid comprising the steps
of:

a) providing a set of partially single-stranded nucleic acid
fragments wherein each fragment contains a sequence that
corresponds to a sequence of the target;

b) hybridizing the single-stranded portions of the fragments to
single-stranded portions of a set of partially double-stranded
nucleic acid probes to form a set of complexes, and for each
complex;

i) ligating a single strand of the fragment to an adjacent
single strand of the probe; and

ii) extending the unligated strand of the complex by
strand-displacement polymerization using the ligated
strand as a template and mass-modifying the extended
strand;

c) determining the molecular weights of the extended strands by
mass spectrometry; and

d) determining the sequence of the target from the molecular
weights of the extended strands.

6. A method for sequencing a target nucleic acid comprising the steps
of:

a) providing a set of nucleic acids complementary to a sequence
of said target;

b) hybridizing said set to an array of single-stranded nucleic acid
probes wherein each probe comprises a constant sequence and
a variable sequence and said variable sequence is
determinable;

c) determining molecular weights of hybridized nucleic acids;
and

d) identifying the sequence of said target.

7. A method for sequencing a target nucleic acid comprising the steps
of:

a) providing a set of nucleic acids homologous to a sequence of
said target;

b) hybridizing said set to an array of single-stranded nucleic acid
probes wherein each probe comprises a constant sequence and
a variable sequence;

c) determining molecular weights of hybridized nucleic acids;
and

d) identifying the sequence of said target.

8. A method for sequencing a target nucleic acid comprising the steps
of:

a) providing a set of partially single-stranded nucleic acid
fragments wherein each fragment contains a sequence that
corresponds to a sequence of the target;

b) hybridizing the single-stranded portions of the fragments to
single-stranded portions of a set of partially double-stranded
nucleic acid probes to form a set of complexes wherein each
probe contains a variable sequence within the single-stranded
region, and for each complex;

i) ligating a single strand of the fragment to an adjacent
single strand of the probe; and

ii) extending the unligated strand of the complex by
strand displacement polymerization using the ligated
strand as a template;

c) determining the molecular weights of the extended strands by
mass spectrometry; and

d) determining the sequence of the target from the molecular
weights of the extended strands.

9. A method for sequencing a target nucleic acid comprising the steps
of:

a) providing a set of nucleic acid fragments each containing a
sequence which corresponds to a sequence of said target;

b) hybridizing said set to an array of nucleic acid probes,
wherein each probe comprises a double-stranded portion, a
single-stranded portion and a variable sequence within said
single-stranded portion;

c) extending a strand of the probe enzymatically using the
hybridized fragment as a template to create an extended
nucleic acid;

d) removing alkali cations from said extended nucleic acid;

e) determining molecular weights for a plurality of protonated
and extended nucleic acids by mass spectrometry; and

f) determining the sequence of said target.

10. A method for sequencing a target nucleic acid comprising the steps
of:

a) providing a set of nucleic acid fragments each containing a
sequence which corresponds to a sequence of said target;

b) hybridizing said set to an array of nucleic acid probes wherein
each probe comprises a double-stranded portion, a
single-stranded portion and a variable sequence within said
single-stranded portion, to form a target array of nucleic acids;

c) extending a strand of the probe using the hybridized fragment
as a template;

d) determining molecular weights for a plurality of nucleic acids
of said target array; and

e) determining the sequence of said target.

11. A method for sequencing a target nucleic acid comprising the steps
of:

a) fragmenting a sequence of the target into nucleic acid
fragments;

b) hybridizing said fragments to an array of nucleic acid probes
wherein each probe comprises a double-stranded portion, a
single-stranded portion and a variable sequence within said
single-stranded portion and said array is attached to a solid
support;

c) determining molecular weights of hybridized fragments by
mass spectrometry;

d) determining nucleotide sequences of the hybridized
fragments; and

e) identifying the sequence of said target.

12. The method of claims 1-11 wherein the target nucleic acid is obtained
from a biological or recombinant source.

13. The method of claims 1-11 wherein the target nucleic acid and the
probe are each between about 10 to about 1,000 nucleotides in length.

14. The method of claims 1-11 wherein the sequence is homologous with
at least a portion of said target sequence.

15. The method of claims 1-11 wherein the sequence is complementary
to at least a portion of said target sequence.

16. The method of claims 1-11 wherein the set, the fragments or the
probes are dephosphorylated by treatment with a phosphatase prior to
hybridization.

17. The method of claims 1-11 wherein the set or the fragments are
created by enzymatically or physically cleaving said target, or by
enzymatically replicating said target with chain terminating and chain
elongating nucleotides.

18. The method of claims 1-5 or 8-11 wherein the fragments comprise a
nested set.

19. The method of claims 1-11 wherein the target, the fragments and the
probes comprise DNA, RNA, PNA or modifications or combinations
thereof.

20. The method of claims 1-11 wherein the fragments are provided by
synthesizing a complementary copy of the target sequence and fragmenting
said target sequence by nuclease digestion.

21. The method of claims 1-11 wherein the fragments are provided by
enzymatically polymerizing complementary copies of said target with chain
terminating and chain elongating nucleotides.

22. The method of claims 1-11 wherein the nucleic acid fragments
comprise greater than about 10^ different members and each member is
between about 10 to about 1,000 nucleotides in length.

23. The method of claims 1-11 wherein the set or the target fragments is
provided by enzymatically polymerizing complementary copies of said
target with chain terminating and chain elongating nucleotides.

24. The method of claim 23 wherein enzymatic polymerization is a
nucleic acid amplification process selected from the group consisting of
strand displacement amplification, ligase chain reaction, Qβ replicase
amplification, 3SR amplification and polymerase chain reaction
amplification.

25. The method of claims 6 or 7 wherein the constant sequence is
between about 3 to about 18 nucleotides in length.

26. The method of claims 1-11 wherein the single-stranded portion of
each probe contains a variable sequence of between about 4 to about 9
nucleotides in length.

27. The method of claims 1-11 wherein the fragments, the set of nucleic
acids or the probes are attached to a solid support.

28. The method of claims 1-11 wherein each probe is between about 10
to about 50 nucleotides in length.

29. The method of claims 1-5 or 8-11 wherein the double-stranded
regions of the probes contain the same sequence for each probe of the set.

30. The method of claims 1-11 further comprising the step of ligating
hybridized fragments to said probes.

31. The method of claims 1-11 further comprising the step of extending
a strand of the probe using the hybridized fragment as a template wherein
the extended strand displaces the hybridized fragment.

32. The method of claim 31 wherein the extended strand comprises
between about 0.1 femtomole to about 1.0 nanomole of nucleic acid.

33. The method of claim 31 wherein the extended strand is between
about 10 to about 100 nucleotides in length.

34. The method of claims 1-11 wherein there are less than or equal to 4^R
different probes and R is the length in nucleotides of the variable sequence.
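
The bound on the number of different probes follows from the four possible nucleotides at each of the R variable positions. As an illustrative sketch, the snippet below tabulates that upper bound for variable sequences of 4 to 9 nucleotides, the range recited for the variable sequence elsewhere in these claims; the function name `max_probes` is hypothetical.

```python
def max_probes(variable_length):
    """Upper bound on distinct probes for a variable region of length R."""
    if variable_length < 0:
        raise ValueError("variable_length must be non-negative")
    return 4 ** variable_length


# Variable sequences of 4 to 9 nucleotides.
for r in range(4, 10):
    print(f"R = {r}: at most {max_probes(r)} different probes")
```

The array size therefore grows by a factor of four for each additional variable position, from 256 probes at R = 4 to 262,144 at R = 9.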

35. The method of claim 27 wherein the solid support is selected from
the group consisting of plates, beads, microbeads, whiskers, combs,
hybridization chips, membranes, single crystals, ceramics and
self-assembling monolayers.

36. The method of claim 27 wherein the probes are conjugated with
biotin or a biotin derivative and the solid support is conjugated with avidin,
streptavidin or a derivative thereof.

37. The method of claim 27 wherein the probes are attached to said solid
support by a covalent bond, an electrostatic bond, a hydrogen bond, a
photocleavable bond, an electrostatic bond, a disulfide bond, a peptide bond,
a diester bond, a selectively releasable bond or a combination thereof.

38. The method of claim 37 wherein the attachment is a cleavable
attachment which is cleavable by heat, an enzyme, a chemical agent or
electromagnetic radiation.

39. The method of claim 38 wherein the chemical agent is selected from
the group consisting of reducing agents, oxidizing agents, hydrolyzing
agents and combinations thereof.

40. The method of claim 38 wherein the electromagnetic radiation is 
10 selected from the group consisting of visible, ultraviolet and infrared 

radiation, 

4L The method of claim 37 wherein the selectively releasable bond is 
4,4 -dimethoxytrityl or a derivative thereof 

42, The method of claim 41 wherein the derivative is selected from the 
15 group consisting of 3 or4 [bis-(4-methoxypheny!)]-methyl-benzoic acid, N- 
succinimidyl- 3 or 4 [bis-(4-methoxyphenyl)]-methyNben2oic acid, N- 
succinimidyl- 3 or 4 [bis-(4-methoxyphenyl)]-hydrox>TOethyl-benzoic acid, 
N-succinimidyl- 3 or 4 [bis-(4-methoxyphenyl)]-chloromethyl-benzoic acid 
and salts thereof 

43. The method of claim 27 further comprising a spacer between the
probe and the solid support.

44. The method of claim 43 wherein the spacer is selected from the group
consisting of oligopeptides, oligonucleotides, oligopolyamides,
oligoethyleneglycerol, oligoacrylamides, alkyl chains of between about 6 to
about 20 carbon atoms and combinations thereof.

45. The method of claims 1, 6, 7 or 11 wherein the probe is extended
using the hybridized strand as a template.

46. The method of claims 2-5, 8-10 or 45 wherein extending comprises
polymerization incorporating mass-modifying nucleotides into the extended
strand.

47. The method of claims 2-5, 8-10 or 45 wherein the strand is extended
enzymatically using chain terminating and chain elongating nucleotides.

48. The method of claims 2-5, 8-10 or 45 wherein a plurality of extended
strands comprise about 0.1 femtomole to about 1.0 nanomole of nucleic
acid.



49. The method of claims 1-5 or 8-11 wherein the sequence is determined
by polyacrylamide electrophoresis, capillary electrophoresis or mass
spectrometry.

50. The method of claim 46 wherein the mass modified extended nucleic
acid comprises between about 0.1 femtomole to about 1.0 nanomole of
nucleic acid.

51. The method of claim 46 wherein the mass modified extended nucleic
acid is between about 10 to about 100 nucleotides in length.

52. The method of claim 46 wherein the mass modified extended strand
contains a plurality of mass modifying functionalities.

53. The method of claims 1-11 wherein the strand of said probe is mass
modified by enzymatically extending said strand using a polymerase and a
mass modified nucleotide.

54. The method of claim 53 wherein the mass modified nucleotide is a
chain elongating or chain terminating nucleotide.

55. The method of claim 53 wherein the mass modified nucleotide
contains a plurality of mass modifying functionalities.

56. The method of claim 53 wherein the mass modified probes contain
a plurality of mass modifying functionalities.

57. The method of claims 52, 55 or 56 wherein at least one mass
modifying functionality is coupled to a heterocyclic base, a sugar moiety or
a phosphate group.

58. The method of claims 52, 55 or 56 wherein the mass modifying
functionality is a chemical moiety that does not interfere with hydrogen
bonding for base-pair formation.

59. The method of claims 52, 55 or 56 wherein the mass modifying 
functionality is coupled to a purine at position C2, N3, N7 or C8 or a 
deazapurine at position N7 or C9. 

60. The method of claims 52, 55 or 56 wherein the mass modifying
functionality is coupled to a pyrimidine at position C5 or C6.

61. The method of claims 52, 55 or 56 wherein the mass modifying
functionality is selected from the group consisting of deuterium, F, Cl, Br,
I, SiR3, Si(CH3)3, Si(CH3)2(C2H5), Si(CH3)(C2H5)2, Si(C2H5)3,
(CH2)nCH3, (CH2)nNR2, CH2CONR2, (CH2)nOH, CH2F, CHF2 and
CF3; wherein n is an integer and R is selected from the group consisting of
-H, deuterium and alkyls, alkoxys and aryls of 1-6 carbon atoms,
polyoxymethylene, monoalkylated polyoxymethylene, polyethylene imine,
polyamide, polyester, alkylated silyl, heterooligo/polyaminoacid and
polyethylene glycol.

62. The method of claims 52, 55 or 56 wherein the mass modifying
functionality is generated from a precursor functionality which is -N3 or
-XR, wherein X is selected from the group consisting of -OH, -NH2, -NHR,
-SH, -NCS, -OCO(CH2)nCOOH, -NHCO(CH2)nCOOH, -OSO2OH,
-OCO(CH2)nI and -OP(O-alkyl)N(alkyl)2, and n is an integer from 1 to 20;
and R is selected from the group consisting of -H, deuterium and alkyls,
alkoxys and aryls of 1-6 carbon atoms, polyoxymethylene, monoalkylated
polyoxymethylene, polyethylene imine, polyamide, polyester, alkylated
silyl, heterooligo/polyaminoacid and polyethylene glycol.

63. The method of claims 1, 6, 7 or 11 wherein the hybridized nucleic
acid fragment is extended.

64. The method of claims 2, 3, 4, 5, 8-10 or 63 wherein the extended
nucleic acid is mass modified by thiolation.

65. The method of claim 64 wherein thiolation is performed by treating
said extended strand with a Beaucage reagent.
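
To give a feel for the size of the mass shift that thiolation can introduce, the sketch below estimates the added mass when oxygen is replaced by sulfur at the internucleotide linkages of a strand. It is an illustration only: it assumes one oxygen-to-sulfur substitution per linkage, that every linkage is modified, and standard average atomic masses; the function names are hypothetical.

```python
# Standard average atomic masses in daltons (assumed values, not from the application).
OXYGEN_AVG = 15.999
SULFUR_AVG = 32.06


def internucleotide_linkages(length_nt: int) -> int:
    """A linear strand of n nucleotides has n - 1 internucleotide linkages."""
    return max(length_nt - 1, 0)


def thiolation_mass_shift(n_linkages: int) -> float:
    """Mass added when one oxygen is replaced by sulfur at each of n linkages."""
    if n_linkages < 0:
        raise ValueError("number of linkages must be non-negative")
    return n_linkages * (SULFUR_AVG - OXYGEN_AVG)


if __name__ == "__main__":
    for n in (10, 20, 50):
        shift = thiolation_mass_shift(internucleotide_linkages(n))
        print(f"{n}-mer: +{shift:.1f} Da if fully thiolated")
```

Under these assumptions each modified linkage adds roughly 16 Da, so the total shift grows with strand length.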

66. The method of claims 2, 3, 4, 5, 8-10 or 63 wherein the extended
nucleic acid is mass modified by alkylation.

67. The method of claim 66 wherein alkylation is performed by treating
said extended strand with iodoacetamide.

68. The method of claim 66 further comprising the step of removing
alkali cations from said mass modified extended nucleic acid.

69. The method of claim 68 wherein alkali cations are removed by ion
exchange.

70. The method of claim 69 wherein ion exchange comprises contacting
said extended nucleic acid with a solution selected from the group consisting
of ammonium acetate, ammonium carbonate, diammonium hydrogen citrate,
ammonium tartrate and combinations thereof.

71. The method of claims 2, 5, 8, 9, 11 or 49 wherein mass spectrometry
includes a release step selected from the group consisting of laser heating,
droplet release, electrical release, photochemical release and electrospray.

72. The method of claims 2, 5, 8, 9, 11 or 49 wherein mass spectrometry
includes an analytical step selected from the group consisting of Fourier
Transform, ion cyclotron resonance, time of flight analysis with reflection,
time of flight analysis without reflection and quadrupole analysis.

73. The method of claims 2, 5, 8, 9, 11 or 49 wherein mass spectrometry
is performed by fast atom bombardment, plasma desorption, matrix-assisted
laser desorption/ionization, electrospray, photochemical release, electrical
release, droplet release, resonance ionization or a combination thereof.

74. The method of claims 2, 5, 8, 9, 11 or 49 wherein mass spectrometry
includes time of flight with reflection, time of flight without reflection,
electrospray, Fourier transform, ion trap, resonance ionization, ion cyclotron
resonance or a combination thereof.

75. The method of claims 1, 2, 4, 6, 7, 9, 10 or 11 wherein two or more
molecular weights are determined simultaneously.

76. The method of claims 1, 2, 4, 6, 7, 9, 10 or 11 wherein molecular
weights are determined by matrix-assisted laser desorption ionization mass
spectrometry and time of flight analysis.

77. The method of claims 1, 2, 4, 6, 7, 9, 10 or 11 wherein molecular
weights are determined by electrospray ionization mass spectrometry and
quadrupole analysis.

78. A method for detecting a target nucleic acid comprising the steps of:

a) providing a set of nucleic acids complementary to a sequence
of said target;
b) hybridizing said set to a fixed array of nucleic acid probes
wherein each probe comprises a double-stranded portion, a
single-stranded portion and a variable sequence within said
single-stranded portion which is determinable;
c) determining molecular weights of hybridized nucleic acids by
mass spectrometry; and
d) identifying a sequence of the target.

79. A method for detecting a target nucleic acid comprising the steps of:

a) providing a set of nucleic acids complementary to a sequence
of said target;
b) hybridizing said set to a fixed array of nucleic acid probes
wherein each probe comprises a double-stranded portion, a
single-stranded portion and a variable sequence within said
single-stranded portion to form a target array of nucleic acids;
c) mass modifying a plurality of nucleic acids of said target
array;
d) determining molecular weights of the mass modified nucleic
acids by mass spectrometry; and
e) identifying a sequence of the target.

80. The method of claims 78 or 79 wherein the target is provided from
a biological sample.

81. The method of claim 80 wherein the sample is obtained from a
patient.

82. The method of claims 78 or 79 wherein detection of the target is
indicative of a disorder in the patient.

83. The method of claims 78 or 79 wherein the disorder is a genetic
defect, a neoplasm or an infection.

84. An array of nucleic acid probes wherein each probe comprises a first
strand and a second strand wherein said first strand is hybridized to said
second strand forming a double-stranded portion, a single-stranded portion
and a variable sequence within said single-stranded portion, and said array
is attached to a solid support comprising a material that facilitates
volatization of nucleic acids for mass spectrometry.

85. An array of single-stranded nucleic acid probes wherein each probe
comprises a constant sequence and a variable sequence which is
determinable, and said array is attached to a solid support comprising a
matrix that facilitates volatization of nucleic acids for mass spectrometry.

86. The array of claims 84 or 85 wherein the nucleic acid probes are mass
modified nucleic acid probes.

87. The array of claims 84 or 85 which contains less than or equal to
about 4^R different probes and R is the length in nucleotides of the variable
sequence.

88. A kit for detecting a sequence of a target nucleic acid comprising an
array of nucleic acid probes fixed to a solid support wherein each probe
comprises a double-stranded portion, a single-stranded portion and a
variable sequence within said single-stranded portion, and the solid support
comprises a matrix chemical that facilitates volatization of nucleic acids for
mass spectrometry.

89. A kit for detecting a sequence of a target nucleic acid comprising an
array of mass modified nucleic acid probes fixed to a solid support wherein
each probe comprises a double-stranded portion, a single-stranded portion
and a variable sequence within said single-stranded portion, and the solid
support comprises a matrix chemical that facilitates volatization of nucleic
acids for mass spectrometry.

90. A system for determining sequence information comprising a mass
spectrometer, a computer and an array of mass modified nucleic acid probes
wherein each probe comprises a single-stranded portion, an optional
double-stranded portion and a variable sequence within said single-stranded
portion, and wherein said array is attached to a solid support.

91. A system for determining sequence information comprising a mass
spectrometer, a computer and an array of nucleic acid probes wherein each
probe comprises a single-stranded portion, an optional double-stranded
portion and a variable sequence within said single-stranded portion, and
wherein said array is attached to a solid support.
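
One way the computer in such a system might turn measured molecular weights into sequence information is to read a ladder of nested extension products by the mass difference between neighbouring peaks. The sketch below is a minimal illustration under stated assumptions, not the method of the application: it uses commonly quoted average residue masses for unmodified DNA nucleotides, assumes a complete ladder with no missing peaks, and the names `RESIDUE_MASS` and `read_ladder` are hypothetical.

```python
# Average masses (Da) of unmodified deoxynucleotide residues within a DNA strand.
# These are assumed reference values, not figures taken from the application.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def read_ladder(masses: list[float], tol: float = 2.0) -> str:
    """Infer the bases added between successive ladder peaks.

    masses: molecular weights of nested fragments, one per chain length.
    tol:    largest accepted deviation (Da) from a known residue mass.
    """
    ordered = sorted(masses)
    bases = []
    for lighter, heavier in zip(ordered, ordered[1:]):
        delta = heavier - lighter
        # Pick the residue whose mass is closest to the observed difference.
        base, error = min(
            ((b, abs(delta - m)) for b, m in RESIDUE_MASS.items()),
            key=lambda pair: pair[1],
        )
        if error > tol:
            raise ValueError(f"mass difference {delta:.2f} Da matches no residue")
        bases.append(base)
    return "".join(bases)


if __name__ == "__main__":
    # Build a synthetic ladder from an arbitrary 3000 Da starting fragment.
    ladder = [3000.0]
    for b in "GATC":
        ladder.append(ladder[-1] + RESIDUE_MASS[b])
    print(read_ladder(ladder))
```

The closest pair of residue masses (A and T) differs by about 9 Da, so the tolerance has to stay well below that; mass-modified nucleotides would need their own mass table.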



[Drawing sheets 1/34 to 6/34 of WO 96/32504 (PCT/US96/05136), substitute sheets (Rule 26). The drawings did not survive extraction; the legible labels are listed below.]

FIG. 1A: n = 1-50; M = H, OH, XR, Halogen, N3.

FIG. 2B.

FIG. 3: linker structures, including -NH-C(S)-NH-, -O-SO2-O-, -S- and -NH-, each shown with -(CH2CH2O)m-CH2CH2-OH or -(CH2CH2O)m-CH2CH2-O-Alkyl; m = 0, 1-200; r = 1-20.

FIG. 4: m = 0, 1-200; r = 1-20.



[Drawing sheets 7/34 to 9/34; legible labels only.]

FIG. 5: TARGET; GCNNNNNNNGC / CGNNNNNNNCG.

FIG. 6: PROBE (3' NNGTCACNN); PROBE + TARGET (NNGAGTNN / 3' NNGTCACNN); LIGATION; DNA POLYMERASE; TARGET (NNCAGTGNN / NNGTCACNN).

FIG. 7: NUCLEIC ACID STRUCTURE; CALCULATED Tm (°C, AVERAGE BASE COMPOSITION) for n = 8, 7, 6, ...; values in extraction order: 38 33 25; 33 25 15; 15; 15 3 -14; 51 46 40 31; 46 40 31 21; 40 31 21 11.

FIG. 8: MASTER ARRAY; INCUBATE WITH BIOTINYLATED COMPLEMENTARY STRAND; SYNTHESIS OF COMPLEMENTARY ARRAY; CONTACT ABOVE Tm; STREPTAVIDIN-COATED FILTER; INCUBATE WITH COMPLEMENTARY STRANDS; FULL, FINISHED ARRAY.



[Drawing sheets 10/34 to 18/34; legible labels only.]

FIG. 10: COLD WASH; HOT WASH; MATCHED PROBE; * LABEL; BIOTIN; STREPTAVIDIN SURFACE.

Sheet 12/34: plot with axis label HOT WASH/TOTAL COUNTS.

Sheet 18/34: mass spectrum with peaks labelled 7 mer, 10 mer, 11 mer, 19 mer, 20 mer, 24 mer, 26 mer, 33 mer, 37 mer, 38 mer, 42 mer, 46 mer and 50 mer; one mass value, 12826.6, is legible.



[Drawing sheets 19/34 and 20/34; legible labels only.]

FIG. 15: reaction scheme with the steps THIOLATION, DEPROTECTION and ALKYLATION. Legend: DMT = dimethoxytrityl, bis(4-methoxyphenyl)phenylmethyl; T = thymine; Lev = laevulinyl, 4-oxopentanoyl.

FIG. 16: mass spectrum; mass axis with ticks at 1000, 3000, 5000, 8000 and 11000.



[Drawing sheets 21/34 to 24/34; legible labels only.]

FIG. 17: BLOCKING; PCR; BstXI DIGEST (CCANNNNNNTGG / GGTNNNNNNACC); NICK SITE; DYNAL STREPTAVIDIN BEADS; RESTRICTION DIGESTION; DEPHOSPHORYLATION; LIGATION TO PARTIALLY DUPLEX PROBE; DNA POLYMERASE (LARGE FRAGMENT); ddNTP TERMINATION MIX; 5'-GATGATCCGACGCATCAGCTGTG / 3'-CTACTAGGCTGCGTAGTCGp ACAC.

FIG. 19: RESTRICTION DIGESTION BY TspRI, GENERATING 6 FRAGMENTS WITH TOTAL OF 12 DIFFERENT ENDS; NICK SITE; DEPHOSPHORYLATION; ONE KIND IMMOBILIZED PARTIALLY DUPLEX PROBE IS ADDED TO THE MIXTURE OF FRAGMENTS AND LIGATION REACTION IS PERFORMED; DNA POLYMERASE (LARGE FRAGMENT); ddNTP TERMINATION MIX; SEQUENCING PRODUCTS ARE APPLIED TO ALF DNA SEQUENCER.



[Drawing sheets 25/34 to 34/34: no legible figure labels survive extraction.]



WORLD INTELLECTUAL PROPERTY ORGANIZATION
International Bureau

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)

(51) International Patent Classification 6: C12Q 1/68    A3

(11) International Publication Number: WO 96/32504
(43) International Publication Date: 17 October 1996 (17.10.96)

(21) International Application Number: PCT/US96/05136

(22) International Filing Date: 10 April 1996 (10.04.96)

(30) Priority Data:
08/419,994    11 April 1995 (11.04.95)    US
08/420,009    11 April 1995 (11.04.95)    US
08/614,151    12 March 1996 (12.03.96)    US

(71) Applicant: TRUSTEES OF BOSTON UNIVERSITY [US/US];
147 Bay State Road, Boston, MA 02215 (US).

(72) Inventors: CANTOR, Charles, R.; 11 Bay State Road #6,
Boston, MA 02215 (US). KOSTER, Hubert; 1640 Monument
Street, Concord, MA 01742 (US). SMITH, Cassandra,
L.; 11 Bay State Road #6, Boston, MA 02215 (US). FU,
Dong-Jing; 44 Rosemont Avenue, Walton, MA 02154 (US).

(74) Agents: REMENICK, James et al.; Baker & Botts, L.L.P., The
Warner, 1299 Pennsylvania Avenue, N.W., Washington, DC
20004 (US).

(81) Designated States: AL, AM, AT, AU, AZ, BB, BG, BR, BY,
CA, CH, CN, CZ, DE, DK, EE, ES, FI, GB, GE, HU, IS,
JP, KE, KG, KP, KR, KZ, LK, LR, LS, LT, LU, LV, MD,
MG, MK, MN, MW, MX, NO, NZ, PL, PT, RO, RU, SD,
SE, SG, SI, SK, TJ, TM, TR, TT, UA, UG, UZ, VN, ARIPO
patent (KE, LS, MW, SD, SZ, UG), Eurasian patent (AM,
AZ, BY, KG, KZ, MD, RU, TJ, TM), European patent (AT,
BE, CH, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, MC,
NL, PT, SE), OAPI patent (BF, BJ, CF, CG, CI, CM, GA,
GN, ML, MR, NE, SN, TD, TG).

Published
With international search report.
Before the expiration of the time limit for amending the
claims and to be republished in the event of the receipt of
amendments.

(88) Date of publication of the international search report:
14 November 1996 (14.11.96)



(54) Title: SOLID PHASE SEQUENCING OF BIOPOLYMERS 



(57) Abstract 

This invention relates to methods for detecting and se- 
quencing target nucleic acid sequences, and double -stranded 
nucleic acid sequences, to nucleic acid probes, to mass mod- 
ified nucleic acid probes, to arrays of probes useful in these 
methods and to kits and systems which contain these probes. 
Useful methods involve hybridizing the nucleic acids or nu- 
cleic acids which represent complementary or homologous 
sequences of the target to an array of nucleic acid probes. 
These probes comprise a single-stranded portion, an optional 
double-stranded portion and a variable sequence within the 
single-stranded portion. The molecular weights of the hy- 
bridized nucleic acids of the set can be determined by mass 
spectroscopy, and the sequence of the target determined from 
the molecular weights of the fragments. Nucleic acids whose 
sequences can be determined include DNA or RNA in bio- 
logical samples such as patient biopsies and environmental 
samples. Probes may be fixed to a solid support such as a 
hybridization chip to facilitate automated molecular weight 
analysis and identification of the target sequence. 




[Front-page drawing; legible labels only.]
RESTRICTION DIGESTION BY TspRI, GENERATING 6 FRAGMENTS WITH TOTAL
OF 12 DIFFERENT ENDS
DEPHOSPHORYLATION
ONE KIND IMMOBILIZED PARTIALLY DUPLEX PROBE IS ADDED TO THE
MIXTURE OF FRAGMENTS AND LIGATION REACTION IS PERFORMED
NICK SITE
DNA POLYMERASE (LARGE FRAGMENT)
ddNTP TERMINATION MIX
SEQUENCING PRODUCTS ARE APPLIED TO ALF DNA SEQUENCER






INTERNATIONAL SEARCH REPORT

International Application No
PCT/US 96/05136

A. CLASSIFICATION OF SUBJECT MATTER
IPC 6 C12Q1/68

According to International Patent Classification (IPC) or to both national classification and IPC

B. FIELDS SEARCHED

Minimum documentation searched (classification system followed by classification symbols)
IPC 6 C12Q

Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched

Electronic data base consulted during the international search (name of data base and, where practical, search terms used)

C. DOCUMENTS CONSIDERED TO BE RELEVANT

Citation of document, with indication, where appropriate, of the relevant passages, followed by the claims to which it is relevant:

WO,A,94 11530 (UNIV BOSTON) 26 May 1994
see the whole document
Relevant to claim No. 1-91

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF USA,
vol. 91, April 1994, WASHINGTON US,
pages 3072-3076, XP002013927
BROUDE ET AL.: "Enhanced DNA sequencing by hybridisation"
see the whole document
Relevant to claim No. 1-91

WO,A,94 16101 (KOSTER) 21 July 1994
see the whole document
Relevant to claim No. 1-91

Further documents are listed in the continuation of box C.
Patent family members are listed in annex.

* Special categories of cited documents:

'A' document defining the general state of the art which is not
considered to be of particular relevance

'E' earlier document but published on or after the international
filing date

'L' document which may throw doubts on priority claim(s) or
which is cited to establish the publication date of another
citation or other special reason (as specified)

'O' document referring to an oral disclosure, use, exhibition or
other means

'P' document published prior to the international filing date but
later than the priority date claimed

'T' later document published after the international filing date
or priority date and not in conflict with the application but
cited to understand the principle or theory underlying the
invention

'X' document of particular relevance; the claimed invention
cannot be considered novel or cannot be considered to
involve an inventive step when the document is taken alone

'Y' document of particular relevance; the claimed invention
cannot be considered to involve an inventive step when the
document is combined with one or more other such documents,
such combination being obvious to a person skilled
in the art.

'&' document member of the same patent family

Date of the actual completion of the international search
20 September 1996

Date of mailing of the international search report
02.10.96

Name and mailing address of the ISA
European Patent Office, P.B. 5818 Patentlaan 2
NL - 2280 HV Rijswijk
Tel. (+31-70) 340-2040, Tx. 31 651 epo nl,
Fax: (+31-70) 340-3016

Authorized officer
Molina Galan, E

Form PCT/ISA/210 (second sheet) (July 1992)

page 1 of 2



INTERNATIONAL SEARCH REPORT

International Application No
PCT/US 96/05136

C. (Continuation) DOCUMENTS CONSIDERED TO BE RELEVANT

Citation of document, with indication, where appropriate, of the relevant passages:

J BIOMOLEC STRUCT DYNAM,
vol. 11, no. 4, February 1994,
pages 797-812, XP000602231
LYSOV ET AL.: "DNA sequencing by
hybridisation to oligonucleotide matrix.
Calculation of continuous stacking
hybridisation efficiency"
see the whole document

CHIMICA OGGI,
October 1991,
pages 13-16, XP002013928
W. BAINS: "DNA sequencing by mass
spectrometry. Outline of a potential
future application"

DNA SEQUENCE, (1991) 1(6) 375-88,
XP000391524
KHRAPKO ET AL.: "A method for DNA
sequencing by hybridization with
oligonucleotide matrix"
cited in the application

EP,A,0 630 972 (HITACHI LTD) 28 December 1994

FEBS LETTERS,
vol. 256, no. 1 - 02, 9 October 1989,
pages 118-122, XP000304574
KHRAPKO K R ET AL: "AN OLIGONUCLEOTIDE
HYBRIDIZATION APPROACH TO DNA SEQUENCING"
cited in the application

P,X GENETIC ANALYSIS,
vol. 12, January 1996,
pages 137-142, XP000602230
FU ET AL.: "Efficient preparation of
short DNA sequence ladders potentially
suitable for MALDI-TOF DNA sequencing"
see the whole document

Relevant to claim No. (two entries legible on this page, not matched to individual citations): 1-91; 1-91

Form PCT/ISA/210 (continuation of second sheet) (July 1992)

page 2 of 2



INTERNATIONAL SEARCH REPORT
Information on patent family members

International Application No
PCT/US 96/05136

Patent document            Publication    Patent family    Publication
cited in search report     date           member(s)        date

WO-A-9411530               26-05-94       EP-A-0668932     30-08-95
                                          JP-T-8507199     06-08-95
                                          US-A-5503980     02-04-96

WO-A-9416101               21-07-94       AU-A-5992994     15-08-94
                                          CA-A-2153387     21-07-94
                                          EP-A-0679196     02-11-95
                                          US-A-5547835     20-08-96

EP-A-0630972               28-12-94       JP-A-7008300     13-01-95
                                          JP-A-7039399     10-02-95

Form PCT/ISA/210 (patent family annex) (July 1992)



